🗑️ My First LoRA "Trash" Model - Educational Failure Case
⚠️ Warning: This model produces hilariously incoherent outputs!
This is my very first attempt at LoRA fine-tuning, shared for educational purposes. The model generates mostly gibberish, making it a perfect example of what can go wrong when learning parameter-efficient fine-tuning.
🤖 Sample "Trash" Outputs
Q: "What is deep learning?"
A: "Deep learning is a way to understand the data that is being collected. It is a way to display the data that is used to analyze the data..."
Q: "How do you debug a Python program?"
A: "The debug code is :"
Q: "Explain overfitting"
A: "Overfitting the size of the car is a very common technique for removing a car from the vehicle..."
Yes, it really thinks overfitting is about cars! 😂
🔍 What Went Wrong?
- Poor Input Formatting: Used plain text instead of a structured instruction format (contrasted in the sketch after this list)
- Bad Generation Parameters: Temperature too high, no stopping criteria
- Wrong Model Choice: DialoGPT isn't ideal for instruction following
- Missing Special Tokens: No clear instruction/response boundaries
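To make the formatting point concrete, here is a minimal sketch contrasting what I actually fed the model with the kind of structured prompt it needed. The Alpaca-style "### Instruction / ### Response" template below is an illustrative assumption, not the exact template from my training run:

def format_plain(example):
    # What I trained on: instruction and answer glued together as plain text
    return example["instruction"] + " " + example["output"]

def format_structured(example):
    # What instruction tuning needs: explicit, consistent boundaries
    # (Alpaca-style template shown here purely as an illustration)
    return (
        "### Instruction:\n" + example["instruction"] + "\n\n"
        "### Response:\n" + example["output"]
    )

sample = {"instruction": "Explain overfitting", "output": "Overfitting is when a model memorizes its training data..."}
print(format_structured(sample))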
🧠 What I Learned
This beautiful failure taught me:
- The critical importance of data formatting in LLM fine-tuning
- How generation parameters dramatically affect output quality
- Why model architecture choice matters for different tasks
- That LoRA training can succeed technically while failing practically
📊 Technical Details
- Base Model: microsoft/DialoGPT-small (117M params)
- LoRA Rank: 8
- Target Modules: ["c_attn", "c_proj"] (see the config sketch after this list)
- Training Data: Alpaca dataset (poorly formatted)
- Training Loss: Actually decreased! (But outputs still terrible)
- Trainable Parameters: ~262k (0.2% of total)
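For reference, here is a minimal sketch of the PEFT setup those numbers correspond to. Only the rank and target modules come from the actual run; lora_alpha and lora_dropout are illustrative assumptions:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
lora_config = LoraConfig(
    r=8,                                  # LoRA rank from the run above
    target_modules=["c_attn", "c_proj"],  # GPT-2-style attention + projection layers
    lora_alpha=16,                        # assumption: not recorded in this card
    lora_dropout=0.05,                    # assumption: not recorded in this card
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()        # should report roughly 262k trainable params (~0.2%)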
🚀 How to Use (For Science!)
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load the trash model
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
base_model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
model = PeftModel.from_pretrained(base_model, "Tanaybh/my-first-lora-trash-model")
# Generate hilariously bad responses
def generate_trash(prompt):
    inputs = tokenizer.encode(f"Instruction: {prompt}\nResponse:", return_tensors="pt")
    # pad_token_id is set explicitly because the DialoGPT/GPT-2 tokenizer has no pad token
    outputs = model.generate(inputs, max_length=100, temperature=0.7, do_sample=True,
                             pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
# Try it out!
print(generate_trash("What is machine learning?"))
# Expect something like: "Machine learning is when computers learn to computer the learning..."
🛠️ The Fix
After this failure, I learned to:
- Use proper instruction formatting with special tokens (put together in the sketch after this list)
- Lower generation temperature (0.1 instead of 0.7)
- Add clear start/stop markers
- Choose better base models for instruction following
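Putting those fixes together, here is a hedged sketch of what the corrected generation call looks like. The prompt template and stop handling mirror the formatting sketch above and are illustrative, not the exact code of the (still unreleased) fixed version:

def generate_fixed(model, tokenizer, prompt):
    # Clear instruction/response markers instead of a bare "Instruction:" prefix
    text = f"### Instruction:\n{prompt}\n\n### Response:\n"
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.1,                      # down from 0.7
        do_sample=True,
        eos_token_id=tokenizer.eos_token_id,  # explicit stopping criterion
        pad_token_id=tokenizer.eos_token_id,
    )
    # Return only the newly generated tokens, not the echoed prompt
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)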
🎯 Educational Value
This model is perfect for:
- Understanding common LoRA fine-tuning pitfalls
- Demonstrating the importance of proper data formatting
- Teaching debugging skills for LLM training
- Showing that technical success ≠ practical success
🔗 Links
- Fixed Version: [Coming soon after I improve it!]
- Training Code: See files in this repo
- Discussion: Feel free to open issues with questions!
🏷️ Tags
#LoRA #EducationalFailure #MachineLearning #LearningJourney #InstructionTuning
Remember: Every expert was once a beginner who made mistakes like this! Share your failures; they're often more valuable than your successes.