---
license: mit
datasets:
- GAIR/LIMO
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
---

## Model Overview

**DeepSeek-R1-Distill-Qwen-1.5B-LIMO** is a fine-tuned version of **DeepSeek-R1-Distill-Qwen-1.5B**, trained on the **LIMO (Less Is More for Reasoning)** dataset. This fine-tune focuses on improving mathematical reasoning and problem-solving while keeping the model small and efficient to run. The **LIMO dataset**, containing only 817 high-quality reasoning samples, challenges conventional data-scaling assumptions by demonstrating strong reasoning performance with minimal training data.

Model Name: `Josephgflowers/DeepSeek-R1-Distill-Qwen-1.5B-LIMO`

---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6328952f798f8d122ce62a44/ao51DQ6kgqd6iJIHxQcfY.png)

## Key Features

- **Mathematical Reasoning Focus**: Trained on the **LIMO dataset** to enhance step-by-step logical deduction and problem-solving.
- **Optimized for Efficiency**: Built on **DeepSeek-R1-Distill-Qwen-1.5B**, a small distilled model that retains much of the reasoning capability of its larger counterparts.
- **Chain-of-Thought (CoT) Emphasis**: Encourages structured, step-by-step reasoning with improved interpretability.
- **Minimal Data, Strong Generalization**: Uses only **817** high-quality training samples to achieve competitive reasoning performance.

---

## Model Details

- **Base Model**: DeepSeek-R1-Distill-Qwen-1.5B
- **Parameter Count**: 1.5B
- **Training Framework**: Unsloth / Hugging Face Transformers
- **Dataset**: LIMO (Less Is More for Reasoning)
- **Primary Use Cases**:
  - Mathematical and logical reasoning
  - STEM education and tutoring
  - Instruction-following for structured problem-solving

---

## Training Data

This model was fine-tuned on the **LIMO dataset**, which emphasizes data efficiency: competitive reasoning performance from **only 817 training samples**.

### Dataset Highlights

- **Name**: LIMO (Less Is More for Reasoning)
- **Size**: 817 samples
- **Focus**: Chain-of-thought reasoning for structured problem-solving
- **Key Motivation**:
  - High-quality, curated reasoning samples can outperform large-scale noisy data.
  - Designed to test reasoning generalization in a minimal-data setting.

---

## Known Issues & Limitations

- **System Instructions**: The model may require explicit user instructions to enforce structured reasoning.
- **Generalization Scope**: While the LIMO dataset is effective for reasoning tasks, additional fine-tuning may be required for broader NLP applications.
- **Compute Requirements**: At **1.5B** parameters, the model is lighter than larger reasoning models but still requires moderate computational resources.

---

## Citations

```bibtex
@misc{ye2025limoreasoning,
      title={LIMO: Less is More for Reasoning},
      author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
      year={2025},
      eprint={2502.03387},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.03387},
}
```

```bibtex
@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
      title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning},
      author={DeepSeek-AI and multiple contributors},
      year={2025},
      eprint={2501.12948},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.12948},
}
```

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6328952f798f8d122ce62a44/x4Uo-weQyocD0wvjN3JcT.png)
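
## Example Usage

For quick experimentation, the snippet below shows one way to load the model and prompt it for step-by-step reasoning. This is a minimal sketch: the sampling parameters and the example prompt are illustrative assumptions rather than settings recommended by this card, and it assumes the fine-tune retains the base model's chat template.

```python
# Minimal usage sketch with Hugging Face Transformers.
# Sampling settings below are illustrative assumptions, not tuned values
# from the original training run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Josephgflowers/DeepSeek-R1-Distill-Qwen-1.5B-LIMO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # switch to torch.float32 for CPU-only inference
    device_map="auto",
)

# Assumes the tokenizer ships the base model's chat template, so
# apply_chat_template builds the expected prompt format.
messages = [
    {
        "role": "user",
        "content": "Solve step by step: what is the sum of the first 50 positive integers?",
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1024,  # leave room for the chain-of-thought trace
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

At 1.5B parameters, the bfloat16 weights occupy roughly 3 GB, so the model fits comfortably on a single consumer GPU; a generous `max_new_tokens` budget matters because chain-of-thought answers are typically much longer than the final result.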