---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
language:
- en
base_model:
- GetSoloTech/Qwen3-Code-Reasoning-4B
pipeline_tag: text-generation
tags:
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
---

# GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF

This is the GGUF quantized version of the [Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B) model, specifically optimized for competitive programming and code reasoning tasks. This model has been trained on the high-quality Code-Reasoning dataset to enhance its capabilities in solving complex programming problems with detailed reasoning.


## πŸš€ Key Features

* **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
* **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
* **High-Quality Solutions**: Trained on solutions with β‰₯85% test case pass rates
* **Structured Output**: Optimized for generating well-reasoned programming solutions
* **Efficient Inference**: GGUF format enables fast inference on CPU and GPU
* **Multiple Quantization Levels**: Available in various precision levels for different hardware requirements

### Dataset Statistics

* **Split**: Python
* **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
* **Quality Filter**: Only correctly solved problems with β‰₯85% test case pass rates

## πŸ”§ Usage

### Using with llama.cpp

```bash
# Download the model (choose your preferred quantization)
wget https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF/resolve/main/qwen3-code-reasoning-4b.Q4_K_M.gguf

# Run inference
./llama-cli -m qwen3-code-reasoning-4b.Q4_K_M.gguf -n 4096 --repeat-penalty 1.1 -p "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.\n\nProblem: Your programming problem here..."
```

### Using with Python (llama-cpp-python)

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=4
)

# Prepare input for competitive programming problem
prompt = """You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.

Problem: Your programming problem here..."""

# Generate solution
output = llm(
    prompt,
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1
)

print(output['choices'][0]['text'])
```

### Using with Ollama

```bash
# Create a Modelfile
cat > Modelfile << EOF
FROM ./qwen3-code-reasoning-4b.Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
EOF

# Create and run the model
ollama create qwen3-code-reasoning -f Modelfile
ollama run qwen3-code-reasoning "Solve this competitive programming problem: [your problem here]"
```

## πŸ“Š Available Quantizations

| Quantization | Size | Memory Usage | Quality | Use Case |
|--------------|------|--------------|---------|----------|
| Q3_K_M | 2.08 GB | ~3 GB | Good | CPU inference, limited memory |
| Q4_K_M | 2.5 GB | ~4 GB | Better | Balanced performance/memory |
| Q5_K_M | 2.89 GB | ~5 GB | Very Good | High quality, moderate memory |
| Q6_K | 3.31 GB | ~6 GB | Excellent | High quality, more memory |
| Q8_0 | 4.28 GB | ~8 GB | Best | Maximum quality, high memory |
| F16 | 8.05 GB | ~16 GB | Original | Maximum quality, GPU recommended |

## πŸ“ˆ Performance Expectations

This GGUF quantized model largely preserves the performance characteristics of the original finetuned model, with a small quality loss at lower-bit quantizations:

* **Competitive Programming Problems**: Better understanding of problem constraints and requirements
* **Code Generation**: More accurate and efficient solutions
* **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
* **Solution Completeness**: More comprehensive solutions with proper edge case handling

## πŸŽ›οΈ Recommended Settings

### For Code Generation

* **Temperature**: 0.7
* **Top-p**: 0.8
* **Top-k**: 20
* **Max New Tokens**: 4096 (adjust based on problem complexity)
* **Repeat Penalty**: 1.1

### For Reasoning Tasks

* **Temperature**: 0.6
* **Top-p**: 0.95
* **Top-k**: 20
* **Max New Tokens**: 8192 (for complex reasoning)
* **Repeat Penalty**: 1.1
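The two presets above can be kept as plain dictionaries and passed to a llama-cpp-python call via keyword expansion (a sketch; the dict names are illustrative):

```python
# Sampling presets from the recommended settings above; keys match
# llama-cpp-python's Llama.__call__ keyword arguments.
CODE_GENERATION = {
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_tokens": 4096,
    "repeat_penalty": 1.1,
}

REASONING = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "max_tokens": 8192,
    "repeat_penalty": 1.1,
}

# Usage (llm is a loaded llama_cpp.Llama instance):
# output = llm(prompt, **REASONING)
```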

## πŸ› οΈ Hardware Requirements

### Minimum Requirements
* **RAM**: 4 GB (for Q3_K_M quantization)
* **Storage**: 2.5 GB free space
* **CPU**: Multi-core processor recommended

### Recommended Requirements
* **RAM**: 8 GB or more
* **Storage**: 5 GB free space
* **GPU**: NVIDIA GPU with 4GB+ VRAM (optional, for faster inference)

## 🀝 Contributing

This GGUF model was converted from the original LoRA-finetuned model. For questions about:

* The original model: [GetSoloTech/Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B)
* The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
* The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
* The training framework: [Unsloth Documentation](https://github.com/unslothai/unsloth)

## πŸ“„ License

This model follows the same license as the base model (Apache 2.0). Please refer to the base model license for details.

## πŸ™ Acknowledgments

* **Qwen Team** for the excellent base model
* **Unsloth Team** for the efficient training framework
* **NVIDIA Research** for the original OpenCodeReasoning-2 dataset
* **llama.cpp community** for the GGUF format and tools

## πŸ“ž Contact

For questions about this GGUF model, please open an issue in the repository.

---

**Note**: This model is specifically optimized for competitive programming and code reasoning tasks. The GGUF format enables efficient inference on various hardware configurations while maintaining the model's reasoning capabilities.