🧠 Model Card: Qwen2.5-3b-kk-distilled

🧬 Model Description

This is a distilled language model, fine-tuned on reasoning traces that were derived from the QwQ-32B model using Knights and Knaves logic puzzles. The base model is Qwen2.5-3B.
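For readers unfamiliar with the puzzle family: in a Knights and Knaves puzzle, every knight always tells the truth and every knave always lies, and the task is to work out who is which from what they say. The sketch below is only an illustration of that rule, using a brute-force check over all assignments. The `solve` helper and the two-person puzzle are hypothetical and are not taken from the training data or the paper.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with the statements.

    `statements` maps a speaker's name to a function that takes an assignment
    (name -> True if that person is a knight) and returns whether the
    speaker's claim is true under that assignment.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A knight's claim must be true; a knave's claim must be false.
        if all(world[speaker] == claim(world) for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions


# Hypothetical puzzle: A says "B is a knave"; B says "A and I are both knights".
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}
solutions = solve(["A", "B"], puzzle)
print(solutions)
```

The model is trained to reach such answers through written reasoning rather than enumeration; the enumeration here only shows what counts as a valid solution.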

📄 Associated Paper

Title: Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings

arXiv: https://arxiv.org/pdf/2505.13718

Hugging Face Papers: https://huggingface.co/papers/2505.13718

📚 Model Details

Base model: Qwen2.5-3B
Training data: QwQ-Knights-and-Knaves-Traces
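A minimal loading sketch, assuming the checkpoint is published as `safal312/qwen2.5-3b-kk-distilled` (the repository name shown on this page) and works with the standard Hugging Face `transformers` causal-LM classes. The card does not document a prompt format, so `build_prompt` and its wording are assumptions, as is the `max_new_tokens` default.

```python
MODEL_ID = "safal312/qwen2.5-3b-kk-distilled"  # repository name shown on this page


def build_prompt(puzzle: str) -> str:
    # Assumed plain-text prompt; the card does not specify a prompt format.
    return f"{puzzle}\nLet's reason step by step."


def generate(puzzle: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(puzzle), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("A says: 'B is a knave.' B says: 'A and I are both knights.' Who is a knight?")` would download the weights on first use and return the model's reasoning trace as text. Reasoning traces can be long, so the token limit may need to be raised.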
