unsloth-llama3-alpaca-lora
A 4-bit QLoRA adapter for unsloth/llama-3-8b-bnb-4bit, fine-tuned on the Stanford Alpaca dataset (52K instructions). Lightweight, efficient, and open. Built with Unsloth, Hugging Face PEFT, and 🤗 Datasets for low-resource, instruction-following tasks. Adapter weights only. Reproducible and ready to deploy.
👉 Full training, evaluation, and deployment code is available on GitHub: Cre4T3Tiv3/unsloth-llama3-alpaca-lora
How to Use
Merge Adapter into Base Model
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
BASE_MODEL = "unsloth/llama-3-8b-bnb-4bit"
ADAPTER = "Cre4T3Tiv3/unsloth-llama3-alpaca-lora"
# Load base model and adapter
base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto", load_in_4bit=True)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model = model.merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
# Run inference
prompt = """### Instruction:
What is QLoRA?
### Response:"""
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
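If you plan to reuse the merged weights, you can save them locally and reload them later like a standalone checkpoint. This is a minimal sketch, not part of this repo's documented workflow; the directory name is illustrative.

```python
# Optional: persist the merged model so it can be reloaded later without PEFT.
# "llama3-alpaca-merged" is an example directory name, not part of this repo.
SAVE_DIR = "llama3-alpaca-merged"
model.save_pretrained(SAVE_DIR)
tokenizer.save_pretrained(SAVE_DIR)

# Reload later like any standalone Hugging Face checkpoint:
# model = AutoModelForCausalLM.from_pretrained(SAVE_DIR, device_map="auto")
# tokenizer = AutoTokenizer.from_pretrained(SAVE_DIR)
```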
LoRA Training Configuration
| Parameter | Value |
|---|---|
| Base Model | unsloth/llama-3-8b-bnb-4bit |
| r | 16 |
| alpha | 16 |
| dropout | 0.05 |
| Bits | 4-bit (bnb) |
| Framework | Unsloth + Hugging Face PEFT |
| Adapter Format | LoRA (merged post-training) |
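For reference, an Unsloth configuration equivalent to the table above might look like the sketch below. This is a hedged illustration, not the exact training script: the `max_seq_length` and `target_modules` values are assumptions (typical LLaMA-3 projection layers), not values confirmed by this card. See the GitHub repo for the actual training code.

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model (max_seq_length is an assumed value).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters matching the table above (r=16, alpha=16, dropout=0.05).
# target_modules is a typical choice for LLaMA-style models, not confirmed here.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```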
Dataset
yahma/alpaca-cleaned
- Augmented with 30+ grounded examples explaining QLoRA to mitigate hallucinations
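As a rough illustration of how the data can be prepared, here is a sketch that loads the dataset with 🤗 Datasets and applies the standard Alpaca prompt template. The template and helper below are illustrative assumptions; the exact formatting used for training lives in the GitHub repo.

```python
from datasets import load_dataset

dataset = load_dataset("yahma/alpaca-cleaned", split="train")

# Standard Alpaca-style prompt template (illustrative, not the exact one used in training).
ALPACA_TEMPLATE = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def format_example(example):
    # Build the prompt from the instruction/input/output columns; empty inputs stay blank.
    return {"text": ALPACA_TEMPLATE.format(**example)}

dataset = dataset.map(format_example)
print(dataset[0]["text"][:200])
```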
Hardware Used
- A100 (40GB VRAM)
Evaluation
This adapter was evaluated with a custom script that checks:
- QLoRA hallucination (e.g. “Quantized Linear Regression”) ✅ Mitigated
- Keyword coverage across instruction outputs (≥4/6 match)
- Response quality on instruction-following examples
See eval_adapter.py in the GitHub repo for reproducibility.
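To give a flavor of the keyword-coverage check, a simplified sketch is shown below. The keyword list, threshold, and example response are illustrative only; the authoritative logic is in eval_adapter.py in the GitHub repo.

```python
# Illustrative keyword-coverage check (not the actual eval_adapter.py logic).
# Keywords and threshold are example values for a "What is QLoRA?" style prompt.
EXPECTED_KEYWORDS = ["quantized", "low-rank", "lora", "4-bit", "adapter", "fine-tun"]
THRESHOLD = 4  # require at least 4 of 6 keywords, mirroring the ≥4/6 criterion

def keyword_coverage(response: str, keywords=EXPECTED_KEYWORDS) -> int:
    # Count how many expected keywords appear in the (lowercased) response.
    text = response.lower()
    return sum(1 for kw in keywords if kw in text)

response = "QLoRA fine-tunes a 4-bit quantized model by training low-rank adapter weights."
hits = keyword_coverage(response)
print(f"{hits}/{len(EXPECTED_KEYWORDS)} keywords matched -> {'pass' if hits >= THRESHOLD else 'fail'}")
```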
Limitations
- May hallucinate
- Not intended for factual QA or decision-critical workflows
- Output subject to 4-bit quantization limitations
Intended Use
This adapter is designed for:
- Local inference using QLoRA-efficient weights
- Instruction-following in interactive, UI, or CLI agents
- Experimentation with LoRA/PEFT pipelines
- Educational demos of efficient fine-tuning
Demo
🖥 Try the adapter in a browser:
👉 HF Space → unsloth-llama3-alpaca-demo
Built With
- Unsloth
- Hugging Face PEFT
- 🤗 Datasets
Maintainer
@Cre4T3Tiv3
Built with ❤️ by ByteStack Labs
Citation
If you use this adapter or its training methodology, please consider citing:
@software{unsloth-llama3-alpaca-lora,
  author = {Jesse Moses (Cre4T3Tiv3)},
  title  = {Unsloth LoRA Adapter for LLaMA 3 (8B)},
  year   = {2025},
  url    = {https://huggingface.co/Cre4T3Tiv3/unsloth-llama3-alpaca-lora},
}
License
Apache 2.0
Evaluation results
| Metric | Result (self-reported) |
|---|---|
| Hallucination Detection (QLoRA-specific) | mitigated |
| Instruction Match Score | 2.3 / 3 |
| Output Quality (manual) | pass |