# 🩺 MediMaven Llama-3.1-8B (fp16, v1.1)
A domain-adapted Llama-3 model fine-tuned on ~150k high-quality medical Q&A pairs, then merged into standalone fp16 weights for maximum downstream flexibility.
## ✨ Key points

| Field | Value |
|---|---|
| Base model | Meta-Llama-3-8B |
| Tuning method | QLoRA (4-bit) → merge to fp16 |
| Training data | Curated MedQuAD v2; scraped articles from Mayo Clinic, NIH, NHS, and WebMD |
| Intended use | Medical information retrieval, summarisation, chat |

**Disclaimer:** Outputs are informational and do not constitute medical advice.
## 🔥 Quick start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("dranreb1660/medimaven-llama3-8b-fp16")
model = AutoModelForCausalLM.from_pretrained(
    "dranreb1660/medimaven-llama3-8b-fp16",
    torch_dtype="float16",
    device_map="auto",
)

prompt = "Explain first-line treatment for GERD in two sentences."
inputs = tok(prompt, return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=64)[0],
                 skip_special_tokens=True))
```
## 📊 Evaluation

| Metric | Base Llama-3 8B | MediMaven |
|---|---|---|
| Medical MC-QA (exact-match) | 78.4 | 89.7 |
| F1 (MedQA-RAG) | 0.71 | 0.83 |
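Exact-match scores like those above are conventionally computed by normalising both prediction and gold answer before comparing. A minimal sketch — the normalisation rules here (lowercasing, punctuation and whitespace stripping) are a common convention, not the card's actual evaluation script:

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace (common MC-QA convention)."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def exact_match(preds: list[str], golds: list[str]) -> float:
    """Percentage of predictions equal to the gold answer after normalisation."""
    hits = sum(normalize(p) == normalize(g) for p, g in zip(preds, golds))
    return 100.0 * hits / len(golds)
```

For example, `exact_match(["Omeprazole.", "antacids"], ["omeprazole", "PPIs"])` scores the first pair as a hit and the second as a miss, giving 50.0.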
## 🛠️ How we trained

- Built the dataset from de-duplicated, source-attributed passages (MedQuAD, Mayo Clinic, iCliniq); see the dataset card for more detail.
- Applied QLoRA (32-bit → 4-bit) on an NVIDIA T4: 3 epochs, LR 3e-5, cosine schedule.
- Merged the LoRA adapters to fp16; ran AWQ (see the separate repo) for production inference.
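The cosine schedule in the recipe above decays the learning rate from its peak of 3e-5 down to zero over training. A minimal sketch of that decay curve (the card does not state whether warmup was used, so none is shown here):

```python
import math

PEAK_LR = 3e-5  # peak learning rate from the training recipe above


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 down to 0 at total_steps."""
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

The rate stays near the peak early on, passes through half the peak at the midpoint, and flattens out near zero at the end — a gentler finish than linear decay.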
## 🚦 Limitations & bias

- The Llama-3 license prohibits use in regulated "high-risk" settings.
- English-only; no guarantee of safe output in other languages.
## ⬆️ Versioning

- v1.1: first public release (merged weights, new tokenizer template).
- For lighter deployment, see `medimaven-llama3-8b-awq`.
## 📖 Citation

```bibtex
@misc{medimaven2025llama3,
  title        = {MediMaven Llama-3.1-8B},
  author       = {Kyei-Mensah, Bernard},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/dranreb1660/medimaven-llama3-8b-fp16}}
}
```