🧠 Model Card: Qwen2.5-3b-kk-distilled

🧬 Model Description

This is a distilled language model fine-tuned on reasoning traces derived from the QwQ-32B model using the Knights and Knaves logic puzzles. The base model is Qwen2.5-3B.
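As a rough illustration of how a checkpoint like this might be loaded and queried, here is a minimal sketch using the Hugging Face `transformers` library. The repository ID and BF16 dtype are taken from this card; the prompt wording, the sample puzzle, and the helper names `build_prompt` and `generate` are assumptions made for illustration, since the card does not specify a prompt format.

```python
# Minimal sketch: load the distilled checkpoint and ask it one puzzle.
# Assumes `torch` and `transformers` are installed and the weights can be downloaded.

MODEL_ID = "safal312/qwen2.5-3b-kk-distilled"  # repository ID from this card


def build_prompt(puzzle: str) -> str:
    """Wrap a puzzle in a plain-text prompt.

    Hypothetical format: the card does not document the prompt used in training.
    """
    return (
        "Solve the following Knights and Knaves puzzle. Think step by step.\n\n"
        f"{puzzle}\n\nAnswer:"
    )


def generate(puzzle: str, max_new_tokens: int = 512) -> str:
    """Load the model and return only the newly generated text."""
    # Imported here so that build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # The card lists BF16 as the tensor type of the stored weights.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(puzzle), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Illustrative puzzle, not taken from the training data.
    sample = (
        "A says: 'B is a knave.' B says: 'A and I are both knights.' "
        "Who is a knight and who is a knave?"
    )
    print(generate(sample))
```

Because the base model is not instruction-tuned, a plain completion-style prompt is used here instead of a chat template; if the authors document a specific format, that should take precedence.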


📄 Associated Paper

Title: Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings

arXiv: https://arxiv.org/pdf/2505.13718

Hugging Face Papers: https://huggingface.co/papers/2505.13718


📚 Model Details


Weights format: Safetensors

Model size: 3.09B params

Tensor type: BF16

Repository: safal312/qwen2.5-3b-kk-distilled

Base model: Qwen/Qwen2.5-3B