---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
language:
- en
base_model:
- GetSoloTech/Qwen3-Code-Reasoning-4B
pipeline_tag: text-generation
tags:
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
---
|
|
|
# GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF |
|
|
|
This is the GGUF quantized version of the [Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B) model, optimized for competitive programming and code reasoning tasks. The underlying model was fine-tuned on the high-quality Code-Reasoning dataset to improve its ability to solve complex programming problems with detailed reasoning.
|
|
|
|
|
## 🚀 Key Features
|
|
|
* **Enhanced Code Reasoning**: Specifically trained on competitive programming problems |
|
* **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model |
|
* **High-Quality Solutions**: Trained on solutions with ≥85% test case pass rates
|
* **Structured Output**: Optimized for generating well-reasoned programming solutions |
|
* **Efficient Inference**: GGUF format enables fast inference on CPU and GPU |
|
* **Multiple Quantization Levels**: Available in various precision levels for different hardware requirements |
|
|
|
### Dataset Statistics |
|
|
|
* **Split**: Python |
|
* **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces |
|
* **Quality Filter**: Only correctly solved problems with ≥85% test case pass rates
|
|
|
## 🔧 Usage
|
|
|
### Using with llama.cpp |
|
|
|
```bash
# Download the model (choose your preferred quantization)
wget https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF/resolve/main/qwen3-code-reasoning-4b.Q4_K_M.gguf

# Run inference with the llama-cli binary built from llama.cpp
./llama-cli -m qwen3-code-reasoning-4b.Q4_K_M.gguf -n 4096 --repeat-penalty 1.1 \
  -p "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.\n\nProblem: Your programming problem here..."
```
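
If you prefer not to hard-code download URLs, the file can also be fetched with the `huggingface_hub` Python client. A minimal sketch (the filename is assumed to match the one used above):

```python
from huggingface_hub import hf_hub_download

# Download one quantization from the Hub; returns the local file path
model_path = hf_hub_download(
    repo_id="GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF",
    filename="qwen3-code-reasoning-4b.Q4_K_M.gguf",  # assumed filename, matching the wget example
)
print(model_path)
```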
|
|
|
### Using with Python (llama-cpp-python) |
|
|
|
```python
from llama_cpp import Llama

# Load the model; increase n_ctx if you expect long reasoning traces
llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=4,
)

# Prepare input for a competitive programming problem
prompt = """You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.

Problem: Your programming problem here..."""

# Generate a solution with the recommended code-generation settings
output = llm(
    prompt,
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)

print(output['choices'][0]['text'])
```
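
Since the base model uses a ChatML-style chat template, llama-cpp-python's chat-completion API (which applies the chat template stored in the GGUF metadata, when present) may produce better-structured outputs than raw prompting. A minimal sketch:

```python
from llama_cpp import Llama

llm = Llama(model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf", n_ctx=4096)

# create_chat_completion formats the messages with the model's chat template
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an expert competitive programmer."},
        {"role": "user", "content": "Problem: Your programming problem here..."},
    ],
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)
print(response["choices"][0]["message"]["content"])
```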
|
|
|
### Using with Ollama |
|
|
|
```bash
# Create a Modelfile
cat > Modelfile << EOF
FROM ./qwen3-code-reasoning-4b.Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
EOF

# Create and run the model
ollama create qwen3-code-reasoning -f Modelfile
ollama run qwen3-code-reasoning "Solve this competitive programming problem: [your problem here]"
```
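
Once created, the model can also be queried programmatically through Ollama's local REST API. A sketch using the standard `/api/generate` endpoint on the default port 11434:

```python
import json
import urllib.request

# Send a generation request to the local Ollama server
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "qwen3-code-reasoning",
        "prompt": "Solve this competitive programming problem: [your problem here]",
        "stream": False,  # return the full response as a single JSON object
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```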
|
|
|
## 📊 Available Quantizations
|
|
|
| Quantization | Size | Memory Usage | Quality | Use Case |
|--------------|------|--------------|---------|----------|
| Q3_K_M | 2.08 GB | ~3 GB | Good | CPU inference, limited memory |
| Q4_K_M | 2.5 GB | ~4 GB | Better | Balanced performance/memory |
| Q5_K_M | 2.89 GB | ~5 GB | Very Good | High quality, moderate memory |
| Q6_K | 3.31 GB | ~6 GB | Excellent | High quality, more memory |
| Q8_0 | 4.28 GB | ~8 GB | Best | Maximum quality, high memory |
| F16 | 8.05 GB | ~16 GB | Original | Maximum quality, GPU recommended |
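
As a rough rule of thumb, pick the largest quantization whose memory column fits comfortably in your free RAM. An illustrative helper (the sizes mirror the table above; `psutil` is an assumed third-party dependency):

```python
import psutil

# Approximate RAM needed per quantization, in GiB (from the table above)
QUANT_RAM_GIB = {
    "Q3_K_M": 3, "Q4_K_M": 4, "Q5_K_M": 5,
    "Q6_K": 6, "Q8_0": 8, "F16": 16,
}

def pick_quantization() -> str:
    """Return the highest-quality quantization that fits in available RAM."""
    avail_gib = psutil.virtual_memory().available / 2**30
    fitting = [q for q, need in QUANT_RAM_GIB.items() if need <= avail_gib]
    # The dict is ordered smallest to largest, so the last fit is the best
    return fitting[-1] if fitting else "Q3_K_M"

print(pick_quantization())
```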
|
|
|
## 📈 Performance Expectations
|
|
|
This GGUF quantized model largely preserves the performance characteristics of the original fine-tuned model (expect some quality loss at the lower quantization levels):
|
|
|
* **Competitive Programming Problems**: Better understanding of problem constraints and requirements |
|
* **Code Generation**: More accurate and efficient solutions |
|
* **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems |
|
* **Solution Completeness**: More comprehensive solutions with proper edge case handling |
|
|
|
## 🎛️ Recommended Settings
|
|
|
### For Code Generation |
|
|
|
* **Temperature**: 0.7 |
|
* **Top-p**: 0.8 |
|
* **Top-k**: 20 |
|
* **Max New Tokens**: 4096 (adjust based on problem complexity) |
|
* **Repeat Penalty**: 1.1 |
|
|
|
### For Reasoning Tasks |
|
|
|
* **Temperature**: 0.6 |
|
* **Top-p**: 0.95 |
|
* **Top-k**: 20 |
|
* **Max New Tokens**: 8192 (for complex reasoning) |
|
* **Repeat Penalty**: 1.1 |
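
A minimal llama-cpp-python sketch applying the reasoning settings above (the context window is raised so the 8192-token budget actually fits; the exact `n_ctx` value is an assumption):

```python
from llama_cpp import Llama

# A larger context window leaves room for long chains of reasoning
llm = Llama(model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf", n_ctx=16384)

output = llm(
    "Explain, step by step, how to detect a cycle in a directed graph.",
    max_tokens=8192,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    repeat_penalty=1.1,
)
print(output["choices"][0]["text"])
```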
|
|
|
## 🛠️ Hardware Requirements
|
|
|
### Minimum Requirements |
|
* **RAM**: 4 GB (for Q3_K_M quantization) |
|
* **Storage**: 2.5 GB free space |
|
* **CPU**: Multi-core processor recommended |
|
|
|
### Recommended Requirements |
|
* **RAM**: 8 GB or more |
|
* **Storage**: 5 GB free space |
|
* **GPU**: NVIDIA GPU with 4GB+ VRAM (optional, for faster inference) |
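
If a GPU is available and llama-cpp-python was built with CUDA support, layers can be offloaded to VRAM for faster inference. A sketch:

```python
from llama_cpp import Llama

# Offload layers to the GPU (requires a CUDA-enabled build of llama-cpp-python)
llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,  # -1 offloads every layer; lower this if VRAM is limited
)
```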
|
|
|
## 🤝 Contributing
|
|
|
This GGUF model was converted from the original LoRA-finetuned model. For questions about: |
|
|
|
* The original model: [GetSoloTech/Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B) |
|
* The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3) |
|
* The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning) |
|
* The training framework: [Unsloth Documentation](https://github.com/unslothai/unsloth) |
|
|
|
## 📄 License
|
|
|
This model follows the same license as the base model (Apache 2.0). Please refer to the base model license for details. |
|
|
|
## 🙏 Acknowledgments
|
|
|
* **Qwen Team** for the excellent base model |
|
* **Unsloth Team** for the efficient training framework |
|
* **NVIDIA Research** for the original OpenCodeReasoning-2 dataset |
|
* **llama.cpp community** for the GGUF format and tools |
|
|
|
## 📞 Contact
|
|
|
For questions about this GGUF model, please open an issue in the repository. |
|
|
|
--- |
|
|
|
**Note**: This model is specifically optimized for competitive programming and code reasoning tasks. The GGUF format enables efficient inference on various hardware configurations while maintaining the model's reasoning capabilities. |