---
language:
- en
license: apache-2.0
tags:
- text-generation
- instruction-tuning
- multi-task
- reasoning
- email
- summarization
- chat
- peft
- lora
- qwen
- deepseek
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
- HuggingFaceTB/smoltalk
- snoop2head/enron_aeslc_emails
- lucadiliello/STORIES
- abisee/cnn_dailymail
- wiki40b
model_type: causal-lm
inference: true
library_name: peft
pipeline_tag: text-generation
---

# 🧠 Deepseek-R1-multitask-lora

**Author:** Gilbert Akham
**License:** Apache-2.0
**Base model:** [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
**Adapter type:** LoRA (PEFT)
**Capabilities:** Multi-task generalization & reasoning

---

## 🚀 What It Can Do

This multi-task fine-tuned model handles a broad set of natural-language and reasoning tasks, such as:

- ✉️ **Email & message writing**: generate clear, friendly, or professional communications.
- 📖 **Story & creative writing**: craft imaginative narratives, poems, and dialogues.
- 💬 **Conversational chat**: maintain coherent, context-aware conversations.
- 💡 **Explanations & tutoring**: explain technical or abstract topics simply.
- 🧩 **Reasoning & logic tasks**: provide step-by-step answers to analytical questions.
- 💻 **Code generation & explanation**: write and explain Python or general programming code.
- 🌍 **Translation & summarization**: translate between multiple languages or condense information.

The model's multi-domain training (on datasets such as SmolTalk, Everyday Conversations, and reasoning-rich samples) makes it suitable for assistants, chatbots, content generators, and educational tools.

---

## 🧩 Training Details

| Parameter | Value |
|------------|-------|
| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | `adamw_8bit` |
| Gradient accumulation | 4 steps |
| Precision | 4-bit quantized weights, FP16 compute |
| Steps | 12k total (best checkpoint at ~8.2k) |
| Training time | ~2.5 h on an A4000 |
| Frameworks | 🤗 Transformers, PEFT, TRL, BitsAndBytes |

---

## 🧠 Reasoning Capability

Thanks to the integration of **SmolTalk** and diverse multi-task prompts, the model learns:

- **Chain-of-thought-style reasoning**
- **Conversational grounding**
- **Multi-step logical inference**
- **Instruction following** across domains

Example:

```text
### Task: Explain reasoning
### Input: If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?
### Output: The train travels 180 km in 3 hours. Average speed = 180 ÷ 3 = 60 km/h.
```
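
---

## 🔧 Training configuration (sketch)

The hyperparameters in the Training Details table translate roughly into the following PEFT and BitsAndBytes objects. This is a sketch, not the exact training script: `target_modules` and `bias` are assumptions (a common choice for Qwen-style attention layers) that the card does not specify.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantized weights with FP16 compute, as listed in the table above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA hyperparameters from the table; target_modules/bias are assumptions
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```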
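
---

## 🧪 Usage (inference sketch)

A minimal inference sketch, assuming the adapter is published on the Hub as a standard PEFT LoRA adapter. The adapter repo id below is a placeholder (replace it with this repository's actual id), and the prompt format and generation settings are illustrative, mirroring the example above rather than a fixed specification.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
ADAPTER_ID = "<user>/deepseek-R1-multitask-lora"  # placeholder: replace with this repo's id

# Load the base model in 4-bit to mirror the training precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base, ADAPTER_ID)

prompt = (
    "### Task: Write an email\n"
    "### Input: Politely decline a meeting invitation for Friday.\n"
    "### Output:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```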