unsloth-llama3-alpaca-lora


A 4-bit QLoRA adapter for unsloth/llama-3-8b-bnb-4bit, fine-tuned on the Stanford Alpaca dataset (52K instructions). Lightweight, efficient, and open. Built with Unsloth, HF PEFT, and 🤗 Datasets for low-resource, instruction-following tasks. Adapter weights only. Reproducible and ready to deploy.

👉 Full training, evaluation, and deployment code is available on GitHub: Cre4T3Tiv3/unsloth-llama3-alpaca-lora


How to Use

Merge Adapter into Base Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "unsloth/llama-3-8b-bnb-4bit"
ADAPTER = "Cre4T3Tiv3/unsloth-llama3-alpaca-lora"

# Load the 4-bit base model, then attach and merge the LoRA adapter
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, device_map="auto", quantization_config=bnb_config
)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Run inference with the Alpaca-style prompt format
prompt = """### Instruction:
What is QLoRA?

### Response:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
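The adapter expects the Alpaca prompt layout used above. A small helper like the following (a sketch; `build_alpaca_prompt` is a hypothetical name, not a function from the repo) keeps the format consistent, including the optional `### Input:` block from the Alpaca template:

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Build an Alpaca-style prompt; the Input block is optional."""
    if input_text:
        return (
            "### Instruction:\n" + instruction + "\n\n"
            "### Input:\n" + input_text + "\n\n"
            "### Response:"
        )
    return "### Instruction:\n" + instruction + "\n\n### Response:"

print(build_alpaca_prompt("What is QLoRA?"))
```

Pass the result straight to the tokenizer in place of the hand-written prompt string.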

LoRA Training Configuration

| Parameter | Value |
|---|---|
| Base Model | unsloth/llama-3-8b-bnb-4bit |
| r | 16 |
| alpha | 16 |
| dropout | 0.05 |
| Bits | 4-bit (bnb) |
| Framework | Unsloth + Hugging Face PEFT |
| Adapter Format | LoRA (merged post-training) |
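The hyperparameters above map onto a PEFT `LoraConfig` roughly as follows. This is a sketch, not the repo's training script: the `target_modules` list is an assumption (the projection layers commonly adapted on LLaMA models); see the GitHub repo for the exact configuration.

```python
from peft import LoraConfig

# Sketch of the adapter's LoRA hyperparameters from the table above.
# target_modules is an assumption (typical LLaMA projection layers),
# not taken from the repo.
lora_config = LoraConfig(
    r=16,              # LoRA rank
    lora_alpha=16,     # scaling factor
    lora_dropout=0.05, # dropout on LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```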

Dataset

  • yahma/alpaca-cleaned
  • Augmented with 30+ grounded examples explaining QLoRA to mitigate hallucinations

Hardware Used

  • A100 (40GB VRAM)

Evaluation

This adapter was evaluated with a custom script that checks for:

  • QLoRA hallucination (e.g. expanding the acronym as “Quantized Linear Regression”) ✅ Mitigated
  • Keyword coverage across instruction outputs (≥4/6 keywords matched)
  • Response quality on instruction-following examples

See eval_adapter.py in the GitHub repo for reproducibility.


Limitations

  • May still hallucinate, especially on topics outside the instruction-tuning distribution
  • Not intended for factual QA or decision-critical workflows
  • Output quality is subject to 4-bit quantization artifacts

Intended Use

This adapter is designed for:

  • Local inference using QLoRA-efficient weights
  • Instruction-following in interactive, UI, or CLI agents
  • Experimentation with LoRA/PEFT pipelines
  • Educational demos of efficient fine-tuning

Demo

🖥 Try the adapter in a browser:
👉 HF Space → unsloth-llama3-alpaca-demo


Built With

  • Unsloth
  • Hugging Face PEFT
  • 🤗 Datasets
Maintainer

@Cre4T3Tiv3
Built with ❤️ by ByteStack Labs


Citation

If you use this adapter or its training methodology, please consider citing:

@software{unsloth-llama3-alpaca-lora,
  author = {Moses, Jesse (Cre4T3Tiv3)},
  title = {Unsloth LoRA Adapter for LLaMA 3 (8B)},
  year = {2025},
  url = {https://huggingface.co/Cre4T3Tiv3/unsloth-llama3-alpaca-lora},
}

License

Apache 2.0

