# Kodify-Nano-GGUF 🤖
Kodify-Nano-GGUF is the GGUF version of MTSAIR/Kodify-Nano, optimized for CPU/GPU inference with Ollama and llama.cpp: a lightweight LLM for code development tasks with minimal resource requirements.
## Running the Model

You can run Kodify Nano on Ollama in two ways:
- Using Docker
- Locally (provides faster responses than Docker)
### Method 1: Running Kodify Nano on Ollama in Docker

Without an NVIDIA GPU:

    docker run -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama

With an NVIDIA GPU:

    docker run --runtime nvidia -e OLLAMA_HOST=0.0.0.0:8985 -p 8985:8985 --name ollama -d ollama/ollama

Important:
- Ensure Docker is installed and running.
- If port 8985 is occupied, replace it with any available port and update the plugin configuration.

Load the model:

    docker exec ollama ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF

Rename the model:

    docker exec ollama ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano

Start the model:

    docker exec ollama ollama run kodify_nano
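With the container up, the model is served over Ollama's HTTP API on the mapped port. A minimal sketch of calling the `/api/generate` endpoint from the Python standard library (assuming the `localhost:8985` mapping used above):

```python
import json
import urllib.request

# Port mapped in the `docker run` command above; adjust if you changed it
OLLAMA_URL = "http://localhost:8985/api/generate"

def build_request(prompt, model="kodify_nano"):
    """Build a POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# With the server running, send the request and print the completion:
# with urllib.request.urlopen(build_request("Write a hello world in Python")) as resp:
#     print(json.loads(resp.read())["response"])
```

Setting `"stream": False` makes the server return a single JSON object instead of a stream of partial responses, which keeps the parsing one-line simple.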
### Method 2: Running Kodify Nano on Ollama Locally

- Download Ollama: https://ollama.com/download
- Set the port:

      export OLLAMA_HOST=0.0.0.0:8985

  Note: If port 8985 is occupied, replace it and update the plugin configuration.
- Start the Ollama server:

      ollama serve &

- Download the model:

      ollama pull hf.co/MTSAIR/Kodify-Nano-GGUF

- Rename the model:

      ollama cp hf.co/MTSAIR/Kodify-Nano-GGUF kodify_nano

- Run the model:

      ollama run kodify_nano
## Plugin Installation

### For Visual Studio Code
- Download the latest Kodify plugin for VS Code.
- Open the Extensions panel on the left sidebar.
- Click Install from VSIX... and select the downloaded plugin file.
### For JetBrains IDEs
- Download the latest Kodify plugin for JetBrains.
- Open the IDE and go to Settings > Plugins.
- Click the gear icon (⚙️) and select Install Plugin from Disk....
- Choose the downloaded plugin file.
- Restart the IDE when prompted.
### Changing the Port in Plugin Settings (Visual Studio Code and JetBrains)

If you changed the Docker port from 8985, update the plugin's `config.json`:

- Open any file in the IDE.
- Open the Kodify sidebar:
  - VS Code: `Ctrl+L` (`Cmd+L` on Mac).
  - JetBrains: `Ctrl+J` (`Cmd+J` on Mac).
- Access the `config.json` file:
  - Method 1: Click Open Settings (VS Code) or Kodify Config (JetBrains), then navigate to Configuration > Chat Settings > Open Config File.
  - Method 2: Click the gear icon (⚙️) in the Kodify sidebar.
- Modify the `apiBase` port under `tabAutocompleteModel` and `models`.
- Save the file (`Ctrl+S` or File > Save).
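After the change, the relevant `config.json` entries might look like the sketch below. The exact field layout is an assumption based on the `apiBase`, `models`, and `tabAutocompleteModel` keys mentioned above; `8990` stands in for your chosen port:

```json
{
  "models": [
    {
      "title": "Kodify Nano",
      "model": "kodify_nano",
      "apiBase": "http://localhost:8990"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Kodify Nano",
    "model": "kodify_nano",
    "apiBase": "http://localhost:8990"
  }
}
```

Both entries need the same port so that chat and tab autocompletion talk to the same Ollama server.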
## Available Quantization Variants

- `Kodify_Nano_q4_k_s.gguf` (balanced)
- `Kodify_Nano_q8_0.gguf` (high quality)
- `Kodify_Nano.gguf` (best quality, unquantized)
Download using `huggingface_hub`:

    pip install huggingface-hub
    python -c "from huggingface_hub import hf_hub_download; hf_hub_download(repo_id='MTSAIR/Kodify-Nano-GGUF', filename='Kodify_Nano_q4_k_s.gguf', local_dir='./models')"
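A GGUF file downloaded this way can also be imported into a local Ollama instance through a Modelfile (a sketch; the path assumes the `local_dir` from the command above):

```
FROM ./models/Kodify_Nano_q4_k_s.gguf
```

Running `ollama create kodify_nano -f Modelfile` then registers the file under the same model name used elsewhere in this card, without pulling it again from Hugging Face.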
## Python Integration

Install the Ollama Python library:

    pip install ollama

Example code (the client must point at the port configured above, and the model name must match the `kodify_nano` alias created earlier):

    from ollama import Client

    client = Client(host="http://localhost:8985")

    response = client.generate(
        model="kodify_nano",
        prompt="Write a Python function to calculate factorial",
        options={
            "temperature": 0.4,
            "top_p": 0.8,
            "num_ctx": 8192
        }
    )
    print(response['response'])
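Replies often arrive as markdown with fenced code blocks; a small helper (hypothetical, not part of the ollama library) can pull just the code out of `response['response']`:

```python
import re

# Markdown fence marker built indirectly so it does not clash with this block's own fence
FENCE = "`" * 3

def extract_code_blocks(text):
    """Return the contents of all fenced code blocks in a model reply."""
    return re.findall(FENCE + r"[^\n]*\n(.*?)" + FENCE, text, re.DOTALL)

reply = f"Here is the function:\n{FENCE}python\nprint('hi')\n{FENCE}\n"
print(extract_code_blocks(reply)[0])  # -> print('hi')
```

The pattern captures everything between an opening fence (with an optional language tag) and the next closing fence, so multi-block replies yield one string per block.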
## Usage Examples

### Code Generation

    from ollama import Client

    client = Client(host="http://localhost:8985")
    response = client.generate(
        model="kodify_nano",
        prompt="""<s>[INST]
    Write a Python function that:
    1. Accepts a list of numbers
    2. Returns the median value
    [/INST]""",
        options={"num_predict": 512}
    )
### Code Refactoring

    from ollama import Client

    client = Client(host="http://localhost:8985")
    response = client.generate(
        model="kodify_nano",
        prompt="""<s>[INST]
    Refactor this Python code:
    def calc(a, b):
        s = a + b
        d = a - b
        p = a * b
        return s, d, p
    [/INST]""",
        options={"temperature": 0.3}
    )
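The prompts above wrap each instruction in the model's `<s>[INST] ... [/INST]` template by hand; a tiny helper (hypothetical, named here for illustration) keeps that formatting consistent:

```python
def kodify_prompt(instruction):
    """Wrap an instruction in the <s>[INST] ... [/INST] template used above."""
    return f"<s>[INST]\n{instruction}\n[/INST]"

print(kodify_prompt("Write a unit test for the median function"))
```

The result can then be passed as the `prompt=` argument to `generate()` exactly as in the examples above.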