---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
language:
- en
base_model:
- GetSoloTech/Qwen3-Code-Reasoning-4B
pipeline_tag: text-generation
tags:
- coding
- reasoning
- problem-solving
- algorithms
- python
- c++
---
# GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF
This is the GGUF-quantized version of the [Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B) model, optimized for competitive programming and code-reasoning tasks. The underlying model was fine-tuned on the high-quality Code-Reasoning dataset to improve its ability to solve complex programming problems with detailed reasoning.
## πŸš€ Key Features
* **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
* **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
* **High-Quality Solutions**: Trained on solutions with β‰₯85% test case pass rates
* **Structured Output**: Optimized for generating well-reasoned programming solutions
* **Efficient Inference**: GGUF format enables fast inference on CPU and GPU
* **Multiple Quantization Levels**: Available in various precision levels for different hardware requirements
### Dataset Statistics
* **Split**: Python
* **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
* **Quality Filter**: Only correctly solved problems with β‰₯85% test case pass rates
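For reference, the underlying data can be browsed with the Hugging Face `datasets` library. This is a minimal sketch; the exact split and column names are assumptions here and should be verified on the dataset page.
```python
# Minimal sketch: inspect the training data with the `datasets` library.
# The "train" split name below is an assumption -- check the dataset page.
from datasets import load_dataset

ds = load_dataset("GetSoloTech/Code-Reasoning")
print(ds)              # lists the available splits and columns
print(ds["train"][0])  # assumes a "train" split; prints one example
```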
## πŸ”§ Usage
### Using with llama.cpp
```bash
# Download the model (choose your preferred quantization)
wget https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF/resolve/main/qwen3-code-reasoning-4b.Q4_K_M.gguf
# Run inference
./llama-cli -m qwen3-code-reasoning-4b.Q4_K_M.gguf -n 4096 --repeat-penalty 1.1 -p "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.\n\nProblem: Your programming problem here..."
```
### Using with Python (llama-cpp-python)
```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_threads=4,
)

# Prepare input for a competitive programming problem
prompt = """You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful.

Problem: Your programming problem here..."""

# Generate a solution
output = llm(
    prompt,
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)
print(output["choices"][0]["text"])
```
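For multi-turn use, recent versions of llama-cpp-python can also apply the chat template embedded in the GGUF metadata through the chat-completion API. A sketch, reusing the `llm` instance from above:
```python
# Sketch: chat-completion API, which formats messages with the model's
# embedded chat template (assumes a recent llama-cpp-python release).
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an expert competitive programmer."},
        {"role": "user", "content": "Problem: Your programming problem here..."},
    ],
    max_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repeat_penalty=1.1,
)
print(response["choices"][0]["message"]["content"])
```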
### Using with Ollama
```bash
# Create a Modelfile
cat > Modelfile << EOF
FROM ./qwen3-code-reasoning-4b.Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER repeat_penalty 1.1
EOF
# Create and run the model
ollama create qwen3-code-reasoning -f Modelfile
ollama run qwen3-code-reasoning "Solve this competitive programming problem: [your problem here]"
```
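Once created, the model can also be queried through Ollama's local HTTP API (default port 11434). A minimal sketch using Python's `requests`:
```python
# Sketch: call the Ollama REST API; assumes Ollama is running locally on its
# default port and the model was created with the Modelfile shown above.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3-code-reasoning",
        "prompt": "Solve this competitive programming problem: [your problem here]",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=600,
)
print(resp.json()["response"])
```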
## πŸ“Š Available Quantizations
| Quantization | Size | Memory Usage | Quality | Use Case |
|--------------|------|--------------|---------|----------|
| Q3_K_M | 2.08 GB | ~3 GB | Good | CPU inference, limited memory |
| Q4_K_M | 2.5 GB | ~4 GB | Better | Balanced performance/memory |
| Q5_K_M | 2.89 GB | ~5 GB | Very Good | High quality, moderate memory |
| Q6_K | 3.31 GB | ~6 GB | Excellent | High quality, more memory |
| Q8_0 | 4.28 GB | ~8 GB | Best | Maximum quality, high memory |
| F16 | 8.05 GB | ~16 GB | Original | Maximum quality, GPU recommended |
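Individual quantizations can also be fetched programmatically with `huggingface_hub`; a sketch (verify exact filenames in the repository's file listing):
```python
# Sketch: download one quantization from the Hub. The filename follows the
# pattern used in the examples above -- confirm it in the repo file listing.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="GetSoloTech/Qwen3-Code-Reasoning-4B-GGUF",
    filename="qwen3-code-reasoning-4b.Q4_K_M.gguf",  # swap for another quant from the table
)
print(path)  # local cache path, usable as model_path
```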
## πŸ“ˆ Performance Expectations
This GGUF-quantized model largely preserves the behavior of the original fine-tuned model, with lower-bit quantizations trading some quality for memory (see the table above):
* **Competitive Programming Problems**: Better understanding of problem constraints and requirements
* **Code Generation**: More accurate and efficient solutions
* **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
* **Solution Completeness**: More comprehensive solutions with proper edge case handling
## πŸŽ›οΈ Recommended Settings
### For Code Generation
* **Temperature**: 0.7
* **Top-p**: 0.8
* **Top-k**: 20
* **Max New Tokens**: 4096 (adjust based on problem complexity)
* **Repeat Penalty**: 1.1
### For Reasoning Tasks
* **Temperature**: 0.6
* **Top-p**: 0.95
* **Top-k**: 20
* **Max New Tokens**: 8192 (for complex reasoning)
* **Repeat Penalty**: 1.1
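Expressed as llama-cpp-python sampling kwargs, the two presets look like this (a sketch; tune per problem):
```python
# The two presets above as sampling kwargs for llama-cpp-python.
CODE_GENERATION = dict(temperature=0.7, top_p=0.8, top_k=20,
                       repeat_penalty=1.1, max_tokens=4096)
REASONING = dict(temperature=0.6, top_p=0.95, top_k=20,
                 repeat_penalty=1.1, max_tokens=8192)

# Usage: output = llm(prompt, **REASONING)
# Note: raise n_ctx accordingly when generating up to 8192 new tokens.
```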
## πŸ› οΈ Hardware Requirements
### Minimum Requirements
* **RAM**: 4 GB (for Q3_K_M quantization)
* **Storage**: 2.5 GB free space
* **CPU**: Multi-core processor recommended
### Recommended Requirements
* **RAM**: 8 GB or more
* **Storage**: 5 GB free space
* **GPU**: NVIDIA GPU with 4GB+ VRAM (optional, for faster inference)
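With a GPU-enabled build of llama-cpp-python (e.g. compiled with CUDA support), layers can be offloaded to VRAM for faster inference. A sketch:
```python
# Sketch: GPU offload via llama-cpp-python; requires a build compiled with
# GPU support. n_gpu_layers=-1 offloads all layers; lower it if VRAM is tight.
from llama_cpp import Llama

llm = Llama(
    model_path="./qwen3-code-reasoning-4b.Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,  # 0 = CPU-only inference
)
```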
## 🀝 Contributing
This GGUF model was converted from the original LoRA-finetuned model. For questions about:
* The original model: [GetSoloTech/Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B)
* The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
* The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
* The training framework: [Unsloth Documentation](https://github.com/unslothai/unsloth)
## πŸ“„ License
This model follows the same license as the base model (Apache 2.0). Please refer to the base model license for details.
## πŸ™ Acknowledgments
* **Qwen Team** for the excellent base model
* **Unsloth Team** for the efficient training framework
* **NVIDIA Research** for the original OpenCodeReasoning-2 dataset
* **llama.cpp community** for the GGUF format and tools
## πŸ“ž Contact
For questions about this GGUF model, please open an issue in the repository.
---
**Note**: This model is specifically optimized for competitive programming and code reasoning tasks. The GGUF format enables efficient inference on various hardware configurations while maintaining the model's reasoning capabilities.