Model Card for MediLlama-3.2

MediLlama-3.2 is a fine-tuned version of Meta's Llama 3.2 3B Instruct for healthcare and medical applications. It is optimized for tasks such as medical Q&A, symptom checking, and patient education.

Model Details

Model Description

This model is a domain-adapted version of Llama 3.2 3B Instruct. It was fine-tuned with supervised fine-tuning (SFT) on medical datasets to handle English-language healthcare scenarios, including diagnostic queries, treatment suggestions, and general medical advice.

Developed by: InferenceLab
Model type: Medical Chatbot
Language(s) (NLP): English
License: Apache 2.0
Finetuned from model: meta-llama/Llama-3.2-3B-Instruct

Uses

Direct Use

MediLlama-3.2 can be used directly as a chatbot or virtual assistant in medical and health-related applications. It is best suited to educational content, initial symptom triage, and research.

Downstream Use

The model can be integrated into larger telehealth systems, clinical documentation tools, or diagnostic assistants after further task-specific fine-tuning.

Out-of-Scope Use

Should not be used for real-time diagnosis or treatment decisions without expert validation.
Not suitable for high-risk or life-threatening emergency response.
Not trained on pediatric or highly specialized medical domains.

Bias, Risks, and Limitations

While the model is trained on medical data, it may still exhibit:

Biases from source data
Hallucinations or incorrect suggestions
Outdated or non-region-specific medical advice

Recommendations

Users should validate outputs with certified medical professionals. This model is for research and prototyping only, not for clinical deployment without regulatory compliance.

How to Get Started with the Model

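import torch
from transformers import pipeline

# Load the fine-tuned model as a chat-capable text-generation pipeline.
model_id = "InferenceLab/MediLlama-3.2"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # bf16, the same precision used for training
    device_map="auto",  # let Accelerate place the weights on available devices
)

# Chat-style input: a system prompt followed by the user's turn.
messages = [
    {"role": "system", "content": "You are a helpful Medical assistant."},
    {"role": "user", "content": "Hi! How are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
# generated_text holds the whole conversation; the last entry is the assistant's reply.
print(outputs[0]["generated_text"][-1])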

Training Details

Training Data

The model was trained on cleaned and preprocessed medical QA datasets, synthetic doctor-patient conversations, and publicly available health forum posts. Protected health information (PHI) was removed.

Training Procedure

The model was trained with supervised fine-tuning (SFT) using the TRL and Unsloth libraries.

Preprocessing

Inputs were tokenized with the Llama tokenizer after being arranged in a medical instruction format.
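The card does not publish the instruction format itself, so the following is only a minimal sketch of how one QA pair could be arranged as chat messages before tokenization. The helper name `to_chat_messages` and the choice of system prompt are assumptions made for illustration; the prompt string is borrowed from the usage example in this card.

```python
# Hypothetical helper: the actual preprocessing code is not part of this card.
SYSTEM_PROMPT = "You are a helpful Medical assistant."  # taken from the usage example


def to_chat_messages(question, answer, system_prompt=SYSTEM_PROMPT):
    """Arrange one QA pair as a system/user/assistant conversation."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question.strip()},
        {"role": "assistant", "content": answer.strip()},
    ]


example = to_chat_messages(
    "  What is hypertension? ",
    "Hypertension is persistently elevated blood pressure.",
)
for message in example:
    print(message["role"], "->", message["content"])
```

With the Transformers library, a conversation in this shape can be rendered to the model's prompt format by calling `tokenizer.apply_chat_template(messages, tokenize=False)` on the Llama tokenizer.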

Training Hyperparameters

Training regime: bf16 mixed precision
Learning rate: 1e-5

Speeds, Sizes, Times

Training time: ~12 hours on 4×A100 GPUs

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluation used a held-out subset of medical QA pairs, synthetic test cases, and MedQA-derived examples.

Factors

Input prompt complexity
Use of medical terminology
Chat length

Metrics

Accuracy: 81.3%
BLEU: 34.5
ROUGE-L: 62.2
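The card does not state how these metrics were computed. As a rough illustration of what two of them measure, the sketch below implements exact-match accuracy and a sentence-level ROUGE-L F1 based on the longest common subsequence. The whitespace tokenization, lowercasing, and function names are assumptions; established evaluation packages apply their own normalization and may give different numbers.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 on lowercased, whitespace-split tokens (range 0 to 1)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def exact_match_accuracy(references, predictions):
    """Fraction of predictions equal to their reference, ignoring case and outer spaces."""
    pairs = list(zip(references, predictions))
    hits = sum(r.strip().lower() == p.strip().lower() for r, p in pairs)
    return hits / len(pairs) if pairs else 0.0


print(rouge_l_f1("take the tablet with food", "take with food"))
print(exact_match_accuracy(["B", "C"], ["b", "A"]))
```

Scores from this sketch fall between 0 and 1; multiply by 100 to compare with values written on a 0 to 100 scale.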

Results

Summary

The model generalizes well to unseen prompts and performs competitively on general medical dialogue. Further tuning is needed for specialty areas such as oncology or rare diseases.

Model Examination

Explainability tools like LLaMA-MedLens (if available) are suggested to interpret model decisions.

Environmental Impact

Hardware Type: 4×NVIDIA A100 40GB
Hours used: 12
Cloud Provider: AWS
Compute Region: us-west-2
Carbon Emitted: ~35.8 kg CO2eq (estimated)
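The card does not list the inputs behind the carbon estimate. A common back-of-the-envelope approach multiplies GPU count, hours, per-GPU power draw, data-center overhead (PUE), and grid carbon intensity. The sketch below shows that arithmetic. Only the GPU count and hours come from this card; the power draw, PUE, and grid intensity are placeholder assumptions, so the result is not expected to reproduce the figure above.

```python
def estimate_co2_kg(num_gpus, hours, gpu_power_kw, pue, grid_kg_per_kwh):
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = num_gpus * hours * gpu_power_kw * pue
    return energy_kwh * grid_kg_per_kwh


estimate = estimate_co2_kg(
    num_gpus=4,           # from this card
    hours=12,             # from this card
    gpu_power_kw=0.4,     # assumption: average draw per GPU
    pue=1.1,              # assumption: data-center overhead factor
    grid_kg_per_kwh=0.4,  # assumption: regional grid carbon intensity
)
print(f"Estimated emissions: {estimate:.1f} kg CO2eq")
```

Replacing the placeholders with measured power draw and the provider's published regional intensity gives a more defensible estimate.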

Technical Specifications

Model Architecture and Objective

Based on Meta Llama 3.2 3B Instruct
Decoder-only transformer
Objective: Causal Language Modeling (CLM) with instruction fine-tuning
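To make the causal language modeling objective concrete, here is a minimal pure-Python sketch of the next-token cross-entropy loss. Skipping positions labeled `IGNORE_INDEX` (for example, prompt tokens) follows a common convention and is an assumption here: the card does not say whether prompt tokens were masked during fine-tuning. Training frameworks compute the same quantity on tensors.

```python
import math

IGNORE_INDEX = -100  # common convention for positions excluded from the loss


def causal_lm_loss(logits, labels):
    """Mean next-token cross-entropy.

    logits[t] is a list of vocabulary scores produced at position t.
    labels[t] is the token id at position t, or IGNORE_INDEX to skip it.
    Position t predicts the token at t + 1, so logits and labels are shifted.
    """
    total, count = 0.0, 0
    for t in range(len(logits) - 1):
        target = labels[t + 1]
        if target == IGNORE_INDEX:
            continue
        row = logits[t]
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[target]
        count += 1
    return total / count if count else 0.0


# Uniform scores over a 4-token vocabulary give a loss of log(4) per position.
uniform = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
print(causal_lm_loss(uniform, [1, 2, 3]))
```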

Compute Infrastructure

Hardware

4×NVIDIA A100 40GB

Software

Python 3.10
Transformers (v4.40+)
TRL
Unsloth
PyTorch 2.1

Glossary

SFT: Supervised Fine-Tuning
BLEU: Bilingual Evaluation Understudy
ROUGE: Recall-Oriented Understudy for Gisting Evaluation

More Information

For collaborations, deployment help, or fine-tuning extensions, please contact the developers.

Model Card Authors

InferenceLab Team
