
πŸ”§ LoRA Adapter: Fine-tuned Llama 3.2 3B Instruct on FinetuneMe + Orca (4-bit, Unsloth)

This repository contains a LoRA adapter trained on a combined FinetuneMe and Orca dataset, using the Unsloth Llama 3.2 3B Instruct 4-bit model as the base.

It is intended to be used with the base model unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit.


🧠 Model Architecture

The adapter is a PEFT LoRA module trained on top of the 4-bit (bitsandbytes) quantized unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit base model (Llama 3.2 3B Instruct). The base weights stay frozen during training; this repository stores only the low-rank adapter weights, which are applied on top of the base model at load time.


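The exact LoRA hyperparameters (rank, alpha, dropout, target modules) are not recorded in this card. Purely for orientation, a typical Unsloth LoRA setup for this 4-bit base looks roughly like the sketch below; the values shown (r=16, lora_alpha=16, the standard attention and MLP projection targets) are illustrative assumptions, not the settings actually used for this adapter.

from unsloth import FastLanguageModel

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; r / lora_alpha / target_modules are illustrative,
# not the values used for this repository's adapter
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    use_gradient_checkpointing="unsloth",
)
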
πŸ‹οΈβ€β™‚οΈ Training Configuration

  • Max Steps: 600
  • Batch Size: 2 per device
  • Gradient Accumulation: 4 steps
  • Max Sequence Length: 2048 tokens
  • Learning Rate: 2e-4
  • Warmup Steps: 5
  • Optimizer: paged_adamw_8bit
  • Precision: Mixed (fp16 or bf16, chosen based on GPU support); see the configuration sketch below
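
These settings map directly onto a standard transformers/TRL training setup. The sketch below reproduces them with TrainingArguments and SFTTrainer; the dataset variable, dataset_text_field, logging_steps, and output_dir are placeholders or assumptions, and the keyword layout follows older TRL releases (newer releases move max_seq_length and the text field into SFTConfig).

from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import is_bfloat16_supported

# Hyperparameters mirror the list above; dataset handling is a placeholder
trainer = SFTTrainer(
    model=model,                    # the Unsloth LoRA model from the sketch above
    tokenizer=tokenizer,
    train_dataset=dataset,          # placeholder: combined FinetuneMe + Orca dataset
    dataset_text_field="text",      # assumption: formatted prompts stored in a "text" column
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=600,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        optim="paged_adamw_8bit",
        logging_steps=10,           # assumption, not recorded in this card
        output_dir="outputs",
    ),
)
trainer.train()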

πŸ”Ž Inference Instructions

To use this adapter, you must first load the base model and then apply this LoRA adapter on top.

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load base model
base_model = "unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    trust_remote_code=True
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the LoRA adapter from this repository
adapter_repo = "ajaypanigrahi/Llama-3.2-3B-instruct-finetunme-orca-lora-600steps"
model = PeftModel.from_pretrained(model, adapter_repo)

# Inference
prompt = "Write a Python function that returns the first 8 Fibonacci numbers."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
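
Because the base model is an instruct (chat) model, prompts usually work better when passed through the tokenizer's chat template rather than as raw text. A minimal sketch using the standard transformers API:

# Format the prompt with the model's chat template before generating
messages = [
    {"role": "user", "content": "Write a Python function that returns the first 8 Fibonacci numbers."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))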