# 🧠 Phi-2 LoRA Adapter for GSM8K (Math Word Problems)
This repository contains a parameter-efficient LoRA fine-tuning of [`microsoft/phi-2`](https://huggingface.co/microsoft/phi-2) on the GSM8K dataset, designed to solve grade-school arithmetic and multi-step reasoning problems posed in natural language.
⚠️ **Adapter-only:** this is a LoRA adapter, not a full model. It must be loaded on top of `microsoft/phi-2`.
## ✨ What's Inside
- Base Model: [`microsoft/phi-2`](https://huggingface.co/microsoft/phi-2) (2.7B parameters)
- Adapter Type: LoRA (Low-Rank Adaptation) via PEFT
- Task: Grade-school math reasoning (multi-step logic and arithmetic)
- Dataset: [GSM8K](https://huggingface.co/datasets/gsm8k) (see the loading sketch below)
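GSM8K pairs each word problem with a worked solution that ends in a final line of the form `#### <answer>`. A minimal sketch for inspecting the data, assuming the 🤗 `datasets` library and the public `gsm8k` dataset (`main` config):

```python
from datasets import load_dataset

# GSM8K ("main" config): ~7.5k training and ~1.3k test word problems.
gsm8k = load_dataset("gsm8k", "main")

example = gsm8k["train"][0]
print(example["question"])  # natural-language word problem
print(example["answer"])    # step-by-step solution ending in "#### <final answer>"
```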
## 🚀 Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("darshjoshi16/phi2-lora-math")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "darshjoshi16/phi2-lora-math")

# Inference
prompt = "Q: Julie read 12 pages yesterday and twice as many today. If she wants to read half of the remaining 84 pages tomorrow, how many pages should she read?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
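If a standalone checkpoint is preferred (for example, to drop the PEFT dependency at serving time), the LoRA weights can be folded into the base model. A minimal sketch using PEFT's `merge_and_unload`; the output directory name is just an example:

```python
# Fold the LoRA weights into the base weights; the result is a plain
# transformers model that no longer needs PEFT at inference time.
merged = model.merge_and_unload()
merged.save_pretrained("phi2-gsm8k-merged")    # example output path
tokenizer.save_pretrained("phi2-gsm8k-merged")
```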
## 📊 Evaluation Results
| Task | Metric | Score | Samples |
|---|---|---|---|
| GSM8K | Exact Match (strict) | 54.6% | 500 |
| ARC-Easy | Accuracy | 79.0% | 500 |
| HellaSwag | Accuracy (Normalized) | 61.0% | 500 |
Benchmarks were run using EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).
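To reproduce numbers like these, the harness can load the adapter directly via its `peft` model argument. A sketch assuming a recent lm-evaluation-harness (v0.4+); the task names and 500-sample limit mirror the table above:

```python
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=microsoft/phi-2,peft=darshjoshi16/phi2-lora-math",
    tasks=["gsm8k", "arc_easy", "hellaswag"],
    limit=500,  # 500 samples per task, as reported above
)
print(results["results"])
```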
## ⚙️ Training Details
- Method: LoRA (rank=8, alpha=16, dropout=0.1; see the config sketch after this list)
- Epochs: 1 (proof of concept)
- Batch size: 4 per device
- Precision: FP16
- Platform: Google Colab (T4 GPU)
- Framework: 🤗 Transformers + PEFT
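For a comparable run, the hyperparameters above translate into the following PEFT configuration. This is a sketch, not the exact training script; in particular, `target_modules` (phi-2's attention projections) is an assumption, since the card does not list it:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

lora_config = LoraConfig(
    r=8,               # rank, as listed above
    lora_alpha=16,     # alpha, as listed above
    lora_dropout=0.1,  # dropout, as listed above
    task_type="CAUSAL_LM",
    # Assumption: target phi-2's attention projections (not stated in the card).
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # LoRA trains only a small fraction of weights
```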
## 🚧 Limitations
- Fine-tuned for math word problems only (not general-purpose reasoning)
- Trained for only 1 epoch; additional training may improve performance
- Adapter-only: the base model (`microsoft/phi-2`) must be loaded alongside it
## 💬 Author

This model was fine-tuned and open-sourced by Darsh Joshi. Feel free to reach out or contribute.