This is a distilled language model: Qwen2.5-3B (the base model) fine-tuned on reasoning traces derived from the QwQ-32B model on Knights and Knaves logic puzzles.
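For readers unfamiliar with the puzzle family, here is a minimal sketch of what a Knights and Knaves puzzle asks. In these puzzles every character is either a knight (always tells the truth) or a knave (always lies), and the task is to work out who is which from their statements. The puzzle, the `solve` helper, and the encoding of statements as Python functions are all hypothetical illustrations; they are not taken from the paper's dataset or training pipeline.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with the statements.

    `statements` maps a speaker to a function that takes an assignment
    (name -> True for knight, False for knave) and returns the truth
    value of what that speaker said.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A knight's statement must be true; a knave's must be false.
        if all(world[speaker] == claim(world) for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions


# Hypothetical puzzle (not from the paper):
#   A says: "B is a knave."
#   B says: "A and I are the same type."
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] == w["B"],
}

solutions = solve(["A", "B"], puzzle)
print(solutions)  # [{'A': True, 'B': False}]
```

Brute-force enumeration is enough for a handful of characters. The reasoning traces described above would instead spell out the case analysis in natural language, which is the behaviour the fine-tuning aims to transfer.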
Title: Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings
arXiv: https://arxiv.org/pdf/2505.13718
hf papers: https://huggingface.co/papers/2505.13718
Base model: Qwen/Qwen2.5-3B