🩺 DeepSeek 16B Medical GPT (QLoRA Fine-Tuned)

darkknight25/deepseek-16b-medical-GPT is a fine-tuned version of deepseek-ai/deepseek-moe-16b-chat, optimized for medical question answering, reasoning, and clinical summarization using QLoRA and open-access healthcare datasets.

This model uses a Mixture-of-Experts (MoE) architecture with QLoRA-based adaptation, unlocking medical domain performance with efficient training.


🧠 Model Details

  • Base Model: deepseek-ai/deepseek-moe-16b-chat
  • Fine-Tuning Method: QLoRA (4-bit quantized)
  • Adapter Method: LoRA via peft
  • Trainable Parameters: ~85M
  • Quantization: 4-bit NF4 using bitsandbytes

🧬 Base Model: deepseek-ai/deepseek-moe-16b-chat

  • 16B-parameter MoE with only a small subset of experts active per token
  • Low inference cost (~2.8B parameters active per token)
  • Strong reasoning, multi-domain instruction-tuned
  • Trained on 2T tokens, multilingual support
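
For intuition, an MoE layer routes each token through only its top-scoring experts instead of the full network. A minimal pure-Python top-2 gating sketch (illustrative only; not DeepSeek's actual router):

```python
import math

def top2_route(scores):
    """Select the two highest-scoring experts and softmax-normalize their gates."""
    top2 = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
    exps = [math.exp(scores[i]) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]

# One router score per expert for a single token (8 experts in this toy example)
routes = top2_route([0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2])
print(routes)  # only experts 1 and 4 process this token; the rest stay idle
```

Because only the selected experts run, compute per token scales with the active parameter count rather than the full 16B.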

🧪 Medical Fine-Tuning Setup

  • Technique: QLoRA (4-bit quantization) + LoRA adapters
  • Target Modules: q_proj, k_proj, v_proj, o_proj
  • Trainable Params: ~85M
  • Batch Size: 4
  • Epochs: 2
  • Optim: paged_adamw_8bit
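
The setup above corresponds roughly to the following peft/transformers configuration. The quantization settings and target modules mirror the bullets; the LoRA rank, alpha, and dropout values are illustrative assumptions not stated in this card:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on the attention projections listed above;
# r / lora_alpha / lora_dropout are assumed values, not documented here
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Passing `bnb_config` to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and wrapping the model with `peft.get_peft_model(model, lora_config)` yields the small (~85M) set of trainable adapter weights.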

📚 Training Datasets

Fine-tuned using a diverse set of public medical datasets:

| Dataset | Description |
|---|---|
| pubmed_qa | Biomedical QA pairs |
| medmcqa | Indian medical entrance exam questions |
| ccdv/pubmed-summarization | Clinical note → abstract summaries |
| ohsumed | Medical literature abstracts |
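
Records from QA datasets like pubmed_qa are typically flattened into the plain Q/A prompt format the model sees at inference time. A minimal sketch (the field names follow PubMedQA's schema; treat them as an assumption here):

```python
def to_prompt(record):
    """Flatten a PubMedQA-style record into a (prompt, target) training pair."""
    context = " ".join(record.get("context", {}).get("contexts", []))
    prompt = f"Context: {context}\nQ: {record['question']}\nA:"
    return prompt, record["long_answer"]

example = {
    "question": "Does aspirin reduce fever?",
    "context": {"contexts": ["Aspirin is an NSAID.", "NSAIDs are antipyretic."]},
    "long_answer": "Yes; aspirin is antipyretic via prostaglandin inhibition.",
}
text, target = to_prompt(example)
print(text)
```

The target answer is appended after the `A:` marker during supervised fine-tuning, so the same `Q: ...\nA:` shape can be reused at inference.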

🏥 Use Cases

This model is best suited for:

  • Clinical decision support
  • Biomedical Q&A
  • Medical reasoning
  • Summarizing clinical notes
  • Patient education bots

🔁 Inference Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "darkknight25/deepseek-16b-medical-GPT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Q: What is the treatment for bacterial meningitis?\nA:"
# Use model.device so the example also works when device_map places
# the model somewhere other than cuda:0
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

🧪 Evaluation (coming soon)

Planned evaluation on:

  • MedMCQA (accuracy %)
  • PubMedQA (MCQ performance)
  • USMLE-style clinical cases

Want to contribute eval scripts? PRs welcome.

🔐 License

MIT License (same as the base model). Free for research and commercial use, with proper attribution.

🙏 Acknowledgements

  • Base model: DeepSeek MoE 16B Chat
  • Hugging Face PEFT, Datasets, and Transformers
  • Medical datasets: PubMedQA, MedMCQA, PubMed, OHSUMED

🤖 Author

@darkknight25 – Security Researcher | ML Engineer | Medical AI

Want to collaborate on more medical AI projects or build a chatbot? Ping me.
