DeepSeek-R1-Distill-Qwen-1.5B Fine-Tuned on GSM8K with Chain-of-Thought Augmentation

Model Overview

This model is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, trained on the OpenAI GSM8K dataset augmented with Chain-of-Thought (CoT) reasoning generated by DeepSeek-V3. The fine-tuning process enhances the model's mathematical problem-solving abilities, allowing it to provide step-by-step solutions with deeper reasoning.

🔹 Key Features

  • Base Model: DeepSeek-R1-Distill-Qwen-1.5B
  • Fine-Tuned On: GSM8K dataset with DeepSeek-V3-enhanced reasoning
  • Improved Mathematical Reasoning: Generates detailed step-by-step CoT explanations
  • Optimized for GRPO Training: Trained using trl and unsloth for efficient fine-tuning

📊 Dataset & Training Details

  • Dataset: eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
    • 8K train samples, 1K test samples
    • Contains question, answer, and CoT reasoning
  • Training Methodology:
    • Used Group Relative Policy Optimization (GRPO) via trl (see the sketch after this list)
    • Applied gradient accumulation to manage larger batch sizes
    • Integrated DeepSeek-V3 augmentation for enhanced logical reasoning
  • Fine-tuning Tools:
    • Unsloth for memory-efficient fine-tuning
    • Hugging Face Transformers for model training
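
The snippet below is a minimal sketch of this setup, assuming trl >= 0.14 (which ships GRPOTrainer) and unsloth are installed. The reward function, column mapping, and hyperparameters are illustrative placeholders, not the exact values used for this model.

from unsloth import FastLanguageModel  # import unsloth first so its optimizations patch in
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Load the base model with unsloth's memory-efficient 4-bit loading
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    max_seq_length=2048,
    load_in_4bit=True,
)

dataset = load_dataset(
    "eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1",
    split="train",
)
# GRPOTrainer expects a "prompt" column; map the dataset's "question" onto it
dataset = dataset.map(lambda row: {"prompt": row["question"]})

def correctness_reward(prompts, completions, answer, **kwargs):
    # Hypothetical reward: 1.0 when the gold answer string appears in the
    # completion. Extra dataset columns (here "answer") arrive as kwargs.
    return [1.0 if a in c else 0.0 for c, a in zip(completions, answer)]

training_args = GRPOConfig(
    output_dir="outputs",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # gradient accumulation for a larger effective batch
    num_generations=4,              # completions sampled per prompt by GRPO
    max_completion_length=512,
)

trainer = GRPOTrainer(
    model=model,
    reward_funcs=[correctness_reward],
    args=training_args,
    train_dataset=dataset,
)
trainer.train()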

For those interested in replicating the fine-tuning process, I have shared an updated Colab notebook 📓:
🔗 Colab Notebook

You will need:
✅ Hugging Face Token
✅ Together.AI API Key
✅ Unsloth Package


🚀 How to Run the Model (Mac via llama.cpp)

Yes! You can run this model locally on macOS using llama.cpp.

1๏ธโƒฃ Install Homebrew (If Not Installed)

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Then add Homebrew to your PATH:

echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"

2๏ธโƒฃ Install llama.cpp

brew install llama.cpp

3๏ธโƒฃ Run the Model with llama-cli

llama-cli -hf eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced-gguf:Q8_0

4๏ธโƒฃ Alternative: Run Locally via GGUF

mkdir -p ~/llama_models && cd ~/llama_models
wget https://huggingface.co/eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced-gguf/resolve/main/Q8_0.gguf
llama-cli -m ~/llama_models/Q8_0.gguf --interactive
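
If you prefer to script against the downloaded GGUF file from Python rather than the CLI, the llama-cpp-python bindings work too. A minimal sketch, assuming pip install llama-cpp-python and the Q8_0.gguf download from the previous step:

import os

from llama_cpp import Llama

# Load the quantized model downloaded above (expand ~ manually; Llama does not)
llm = Llama(
    model_path=os.path.expanduser("~/llama_models/Q8_0.gguf"),
    n_ctx=2048,  # context window size
)

out = llm(
    "A farmer has 24 apples. He gives 6 to each of his 3 children. How many does he have left?",
    max_tokens=512,
)
print(out["choices"][0]["text"])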

📌 How to Use the Model via Python (transformers)

You can load the model with Hugging Face Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced"

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "A farmer has 24 apples. He gives 6 to each of his 3 children. How many does he have left?"
inputs = tokenizer(prompt, return_tensors="pt")

# Cap new tokens rather than total length so the CoT reasoning isn't truncated
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
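
Because the base model is an R1-style chat model, results may improve if the question is wrapped in the tokenizer's chat template before generation. A minimal sketch, assuming the checkpoint keeps the base model's chat template; it reuses the model, tokenizer, and prompt from above:

# Wrap the question as a chat turn and apply the model's chat template
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))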

🔬 Expected Performance

Compared to the base DeepSeek-R1-Distill-Qwen-1.5B, this fine-tuned model:

  • Provides more detailed Chain-of-Thought (CoT) explanations for GSM8K problems.
  • Improves logical reasoning and step-by-step answer formulation.
  • Generates clearer, more structured solutions, making it ideal for educational use.

🗂 Model Hosting & License

📌 Model on Hugging Face Hub:
👉 eagle0504/deepseek-r1-qwen-1.5b-gsm8k-enhanced

📜 License: MIT License – Open for modification and distribution.


If you have feedback or ideas for improvement, feel free to reach out! 🚀🔥

#AI #MachineLearning #DeepSeek #GSM8K #LLM #ChainOfThought #HuggingFace #GRPO #Reasoning
