---
license: mit
datasets:
  - mlabonne/FineTome-100k
  - microsoft/orca-math-word-problems-200k
language:
  - en
base_model:
  - unsloth/Llama-3.2-3B-Instruct-unsloth-bnb-4bit
library_name: transformers
---

🔧 LoRA Adapter: Fine-tuned Llama 3.2 3B Instruct on FineTome + Orca Math (4-bit, Unsloth)

This repository contains a LoRA adapter trained on a combination of the FineTome-100k and orca-math-word-problems-200k datasets, using Unsloth's 4-bit Llama 3.2 3B Instruct model as the base.

It is intended to be used on top of the base model unsloth/Llama-3.2-3B-Instruct-unsloth-bnb-4bit.
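
If you already work with Unsloth, the adapter can also be loaded through FastLanguageModel, which resolves the 4-bit base model referenced in the adapter config. A minimal sketch (an alternative to the Transformers + PEFT route shown in the inference section below):

```python
from unsloth import FastLanguageModel

# Loading the adapter repository pulls in the 4-bit base model it was
# trained on and attaches the LoRA weights on top.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ajaypanigrahi/Llama-3.2-3B-instruct-finetunme-orca-lora-600steps",
    max_seq_length=2048,   # mirrors the training sequence length
    load_in_4bit=True,
)

# Switch Unsloth into its optimized inference mode before calling generate().
FastLanguageModel.for_inference(model)
```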


🧠 Model Architecture
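
Only the LoRA weights live in this repository; the backbone is unsloth/Llama-3.2-3B-Instruct-unsloth-bnb-4bit, i.e. Llama 3.2 3B Instruct loaded in 4-bit via bitsandbytes. The LoRA hyperparameters (rank, alpha, target modules) are not listed in this card, but with a recent version of PEFT they can be read straight from the adapter config, for example:

```python
from peft import PeftConfig

# Reads adapter_config.json from the Hub; for a LoRA adapter this resolves
# to a LoraConfig carrying the rank/alpha/target-module settings.
config = PeftConfig.from_pretrained(
    "ajaypanigrahi/Llama-3.2-3B-instruct-finetunme-orca-lora-600steps"
)
print("base model:    ", config.base_model_name_or_path)
print("LoRA rank r:   ", config.r)
print("LoRA alpha:    ", config.lora_alpha)
print("target modules:", config.target_modules)
```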


πŸ‹οΈβ€β™‚οΈ Training Configuration

  • Max Steps: 600
  • Batch Size: 2 per device
  • Gradient Accumulation: 4 steps
  • Max Sequence Length: 2048 tokens
  • Learning Rate: 2e-4
  • Warmup Steps: 5
  • Optimizer: paged_adamw_8bit
  • Precision: Mixed (fp16 or bf16 based on GPU support)

🔎 Inference Instructions

To use this adapter, first load the base model and then apply the LoRA adapter on top, for example with Transformers and PEFT:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the 4-bit base model (requires bitsandbytes and accelerate)
base_model = "unsloth/Llama-3.2-3B-Instruct-unsloth-bnb-4bit"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Apply the LoRA adapter from this repository on top of the base model
adapter_repo = "ajaypanigrahi/Llama-3.2-3B-instruct-finetunme-orca-lora-600steps"
model = PeftModel.from_pretrained(model, adapter_repo)

# Inference
prompt = "Write a Python function that returns the first 8 Fibonacci numbers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
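
Because the base model is an instruction-tuned chat model, prompts are usually formatted with the tokenizer's chat template rather than passed as raw text. A small variation of the example above (the prompt itself is unchanged):

```python
messages = [
    {"role": "user",
     "content": "Write a Python function that returns the first 8 Fibonacci numbers."},
]

# Render the Llama 3.2 chat template and append the assistant generation prompt.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```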