🩺 MediMaven Llama-3.1-8B (fp16, v1.1)

A domain-adapted Llama-3 fine-tuned on ~150k high-quality Q&A pairs, with the adapters merged into standalone fp16 weights for maximum downstream flexibility.


✨ Key points

| | |
|---|---|
| **Base model** | Meta-Llama-3-8B |
| **Tuning method** | QLoRA (4-bit) → merged to fp16 |
| **Training data** | Curated MedQuAD v2 plus scraped articles from Mayo Clinic, NIH, NHS, and WebMD |
| **Intended use** | Medical information retrieval, summarisation, chat |

> **Disclaimer:** Outputs are informational and do not constitute medical advice.


🔥 Quick start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("dranreb1660/medimaven-llama3-8b-fp16")
model = AutoModelForCausalLM.from_pretrained(
    "dranreb1660/medimaven-llama3-8b-fp16",
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Explain first-line treatment for GERD in two sentences."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```
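
For chat-style use, the tokenizer ships a chat template (see the v1.1 notes below). Here is a minimal sketch, assuming the bundled template follows the standard Llama-3 instruct format:

```python
# Chat-style generation via the tokenizer's chat template.
# Assumes the bundled template follows the standard Llama-3 instruct format.
messages = [
    {"role": "system", "content": "You are a careful medical information assistant."},
    {"role": "user", "content": "What lifestyle changes help manage GERD?"},
]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```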

📊 Evaluation

| Metric | Base Llama-3 8B | MediMaven |
|---|---|---|
| Medical MC-QA (exact-match) | 78.4 | 89.7 |
| F1 (MedQA-RAG) † | 0.71 | 0.83 |
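
Exact-match here is the usual normalized string comparison. A hypothetical scoring helper for illustration; the actual evaluation harness may normalize differently:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = text.lower()
    text = re.sub(f"[{re.escape(string.punctuation)}]", "", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> bool:
    """True iff the normalized prediction equals the normalized gold answer."""
    return normalize(prediction) == normalize(gold)
```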

πŸ› οΈ How we trained

  • Built the dataset from de-duplicated, source-attributed passages (MedQuAD, Mayo Clinic, iCliniq); see the dataset card for details.

  • Applied QLoRA (32-bit → 4-bit) on an NVIDIA T4: 3 epochs, LR 3e-5, cosine schedule.

  • Merged the LoRA adapters to fp16; ran AWQ (see the separate repo) for production inference. A sketch of the QLoRA and merge steps follows below.

Full training notebook
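
Below is a minimal sketch of the QLoRA fine-tune and fp16 merge described above. The 4-bit quantization and optimizer settings mirror the bullets; the LoRA rank, alpha, target modules, and adapter path are illustrative assumptions, so consult the training notebook for the exact configuration.

```python
# Sketch of the QLoRA setup and fp16 merge; LoRA rank/alpha, target modules,
# and the adapter path are assumptions, not the exact training config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize 32-bit weights to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", quantization_config=bnb, device_map="auto"
)
lora = LoraConfig(                        # assumed rank/alpha/targets
    r=16, lora_alpha=32, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base, lora)
# ... train for 3 epochs, lr=3e-5, cosine schedule (e.g. with Trainer/TRL) ...

# After training: merge the adapters back into half-precision weights.
fp16_base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(fp16_base, "medimaven-lora-adapters")
merged = merged.merge_and_unload()
merged.save_pretrained("medimaven-llama3-8b-fp16")
```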

🚦 Limitations & bias

  • The Llama-3 license prohibits use in regulated "high-risk" settings.

  • English-only; no guarantee of safe output in other languages.

⬆️ Versioning

  • v1.1: first public release (merged weights, new tokenizer template).
  • For lighter deployment, see medimaven-llama3-8b-awq.
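
Loading the AWQ variant mirrors the quick start. A sketch, assuming the AWQ repo ships transformers-compatible quantized weights (which requires the autoawq package):

```python
from transformers import AutoModelForCausalLM

# Assumes transformers-compatible AWQ weights (requires the autoawq package).
awq_model = AutoModelForCausalLM.from_pretrained(
    "dranreb1660/medimaven-llama3-8b-awq", device_map="auto"
)
```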

📜 Citation

```bibtex
@misc{medimaven2025llama3,
  title        = {MediMaven Llama-3.1-8B},
  author       = {Kyei-Mensah, Bernard},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/dranreb1660/medimaven-llama3-8b-fp16}}
}
```