---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
language:
- en
base_model:
- GetSoloTech/Qwen3-Code-Reasoning-4B
pipeline_tag: text-generation
tags:
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
---
# GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF
This is the GGUF quantized version of [Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B), optimized for competitive programming and code reasoning tasks. The underlying model was finetuned on the high-quality Code-Reasoning dataset to strengthen its ability to solve complex programming problems with detailed reasoning.
## Key Features
* **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
* **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
* **High-Quality Solutions**: Trained on solutions with ≥85% test case pass rates
* **Structured Output**: Optimized for generating well-reasoned programming solutions
* **Efficient Inference**: GGUF format enables fast inference on CPU and GPU
* **Multiple Quantization Levels**: Available in various precision levels for different hardware requirements
### Dataset Statistics
* **Split**: Python
* **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
* **Quality Filter**: Only correctly solved problems with ≥85% test case pass rates
## Usage
### Using with llama.cpp
```bash
# Download the model (choose your preferred quantization)
wget https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF/resolve/main/qwen3-code-reasoning-4b.Q4_K_M.gguf
# Run inference
./llama-cli -m qwen3-code-reasoning-4b.Q4_K_M.gguf -c 4096 -n 4096 --repeat-penalty 1.1 -p "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.\n\nProblem: Your programming problem here..."
```
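llama.cpp also includes `llama-server`, which exposes an OpenAI-compatible HTTP API. Below is a minimal sketch of querying it from Python, assuming the server was started with `./llama-server -m qwen3-code-reasoning-4b.Q4_K_M.gguf --port 8080` and the `openai` package is installed; the model name passed to the client is just a label.
```python
# Minimal sketch: querying a local llama-server through its OpenAI-compatible API.
# Assumes: ./llama-server -m qwen3-code-reasoning-4b.Q4_K_M.gguf --port 8080
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="qwen3-code-reasoning-4b",  # label only; the server serves whatever GGUF it loaded
    messages=[
        {"role": "system", "content": "You are an expert competitive programmer."},
        {"role": "user", "content": "Problem: Your programming problem here..."},
    ],
    temperature=0.7,
    top_p=0.8,
    max_tokens=4096,
)
print(response.choices[0].message.content)
```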
### Using with Python (llama-cpp-python)
```python
from llama_cpp import Llama
# Load the model
llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=4,
)
# Prepare input for competitive programming problem
prompt = """You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.
Problem: Your programming problem here..."""
# Generate solution
output = llm(
    prompt,
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)
print(output['choices'][0]['text'])
```
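Since the base model is chat-tuned, you may get better results by letting llama-cpp-python apply the chat template embedded in the GGUF instead of passing a raw prompt. A minimal sketch reusing the `llm` instance from above:
```python
# Sketch: chat-style inference so the GGUF's embedded chat template is applied.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an expert competitive programmer."},
        {"role": "user", "content": "Problem: Your programming problem here..."},
    ],
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)
print(response["choices"][0]["message"]["content"])
```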
### Using with Ollama
```bash
# Create a Modelfile
cat > Modelfile << EOF
FROM ./qwen3-code-reasoning-4b.Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
EOF
# Create and run the model
ollama create qwen3-code-reasoning -f Modelfile
ollama run qwen3-code-reasoning "Solve this competitive programming problem: [your problem here]"
```
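Ollama can also be driven programmatically through its local REST API. A minimal sketch using the `requests` package, assuming Ollama is running on its default port (11434) and the model was created under the name above:
```python
# Sketch: one-shot generation via the Ollama REST API.
# Assumes `ollama create qwen3-code-reasoning -f Modelfile` has already been run.
import requests

payload = {
    "model": "qwen3-code-reasoning",
    "prompt": "Solve this competitive programming problem: [your problem here]",
    "stream": False,  # return a single JSON object instead of a token stream
    "options": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "repeat_penalty": 1.1},
}
resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=600)
resp.raise_for_status()
print(resp.json()["response"])
```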
## Available Quantizations
| Quantization | Size | Memory Usage | Quality | Use Case |
|--------------|------|--------------|---------|----------|
| Q3_K_M | 2.08 GB | ~3 GB | Good | CPU inference, limited memory |
| Q4_K_M | 2.5 GB | ~4 GB | Better | Balanced performance/memory |
| Q5_K_M | 2.89 GB | ~5 GB | Very Good | High quality, moderate memory |
| Q6_K | 3.31 GB | ~6 GB | Excellent | High quality, more memory |
| Q8_0 | 4.28 GB | ~8 GB | Best | Maximum quality, high memory |
| F16 | 8.05 GB | ~16 GB | Original | Maximum quality, GPU recommended |
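To fetch a specific quantization programmatically rather than with `wget`, `hf_hub_download` from the `huggingface_hub` package can be used; a minimal sketch (verify the exact filename against the repository's file list):
```python
# Sketch: downloading one quantization file from the Hugging Face Hub.
# The filename follows the pattern used elsewhere in this card; confirm it in the repo.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF",
    filename="qwen3-code-reasoning-4b.Q4_K_M.gguf",
)
print(f"Model downloaded to: {path}")
```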
## Performance Expectations
This GGUF quantized model largely preserves the performance characteristics of the original finetuned model; lower-bit quantizations trade a small amount of quality for memory:
* **Competitive Programming Problems**: Better understanding of problem constraints and requirements
* **Code Generation**: More accurate and efficient solutions
* **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
* **Solution Completeness**: More comprehensive solutions with proper edge case handling
## Recommended Settings
### For Code Generation
* **Temperature**: 0.7
* **Top-p**: 0.8
* **Top-k**: 20
* **Max New Tokens**: 4096 (adjust based on problem complexity)
* **Repeat Penalty**: 1.1
### For Reasoning Tasks
* **Temperature**: 0.6
* **Top-p**: 0.95
* **Top-k**: 20
* **Max New Tokens**: 8192 (for complex reasoning)
* **Repeat Penalty**: 1.1
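For convenience, the two presets above can be kept as plain dictionaries and passed directly into llama-cpp-python; a minimal sketch assuming the `llm` instance from the usage section:
```python
# Sketch: the recommended sampler presets as reusable dictionaries.
CODE_GEN = {"temperature": 0.7, "top_p": 0.8, "top_k": 20,
            "repeat_penalty": 1.1, "max_tokens": 4096}
REASONING = {"temperature": 0.6, "top_p": 0.95, "top_k": 20,
             "repeat_penalty": 1.1, "max_tokens": 8192}

# Pick the preset that matches the task for each request.
output = llm("Problem: Your programming problem here...", **CODE_GEN)
print(output["choices"][0]["text"])
```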
## Hardware Requirements
### Minimum Requirements
* **RAM**: 4 GB (for Q3_K_M quantization)
* **Storage**: 2.5 GB free space
* **CPU**: Multi-core processor recommended
### Recommended Requirements
* **RAM**: 8 GB or more
* **Storage**: 5 GB free space
* **GPU**: NVIDIA GPU with 4GB+ VRAM (optional, for faster inference)
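With a supported GPU, layers can be offloaded at load time for a significant speedup. A minimal sketch for llama-cpp-python, assuming a build compiled with GPU support (e.g. CUDA):
```python
# Sketch: offloading layers to the GPU; requires a GPU-enabled llama-cpp-python build.
from llama_cpp import Llama

llm_gpu = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,  # -1 offloads every layer; reduce this on GPUs with little VRAM
)
```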
## Contributing
This GGUF model was converted from the original LoRA-finetuned model. For questions about:
* The original model: [GetSoloTech/Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B)
* The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
* The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
* The training framework: [Unsloth Documentation](https://github.com/unslothai/unsloth)
## License
This model follows the same license as the base model (Apache 2.0). Please refer to the base model license for details.
## Acknowledgments
* **Qwen Team** for the excellent base model
* **Unsloth Team** for the efficient training framework
* **NVIDIA Research** for the original OpenCodeReasoning-2 dataset
* **llama.cpp community** for the GGUF format and tools
## Contact
For questions about this GGUF model, please open an issue in the repository.
---
**Note**: This model is specifically optimized for competitive programming and code reasoning tasks. The GGUF format enables efficient inference on various hardware configurations while maintaining the model's reasoning capabilities.