🧠 Qwen3-0.6B-MedicalDataset-GGUF

A quantized, GGUF-format build of Qwen3-0.6B, fine-tuned on a medical dataset to assist with healthcare-related tasks. The GGUF packaging targets efficient inference engines such as llama.cpp. Released by XformAI-India.


📌 Model Details

  • Base Model: Qwen3-0.6B (596M parameters, qwen3 architecture)
  • Format: GGUF (quantized)
  • Quantization Types: Multiple (Q4_K_M, Q5_K_M, Q6_K, Q8_0, etc.)
  • Precision: 2- to 16-bit quantized variants
  • Use Case: Low-resource and edge-device inference for medical AI applications
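When picking a quant for a storage-constrained device, a rough file-size estimate is parameters × bits-per-weight ÷ 8. The sketch below uses approximate effective bits-per-weight figures for each quant type (these are assumptions, and the estimate ignores GGUF metadata overhead, so real files run somewhat larger):

```python
PARAMS = 596_000_000  # Qwen3-0.6B parameter count

def approx_gguf_size_mb(params: int, bits_per_weight: float) -> float:
    """Rough quantized model file size in MB: params * bpw / 8 bytes."""
    return params * bits_per_weight / 8 / 1e6

# Approximate effective bits-per-weight per quant type (assumed values)
for name, bpw in [("Q4_K_M", 4.85), ("Q5_K_M", 5.7), ("Q6_K", 6.6), ("Q8_0", 8.5)]:
    print(f"{name}: ~{approx_gguf_size_mb(PARAMS, bpw):.0f} MB")
```

At these widths the files land in the hundreds of megabytes, which is what makes the model practical for the edge-device use cases listed above.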

🧪 Intended Use

This quantized model is intended for:

  • Medical Q&A on low-resource devices
  • Offline chatbot usage in healthcare education
  • Mobile inference for healthcare reasoning

🚫 Limitations & Disclaimer

⚠️ This model is not intended for clinical use.

  • Not suitable for real-time diagnostics or emergency decisions.
  • May produce inaccurate or hallucinated medical information.
  • Use for research and prototyping only.

🛠 How to Use

Run with llama.cpp (recent builds name the CLI binary `llama-cli` instead of `main`):

```bash
./main -m qwen3-0.6b-medical-q4_k_m.gguf -p "Explain symptoms of hypertension."
```

Or from Python using llama-cpp-python:

```python
from llama_cpp import Llama

# Load the quantized model (Q4_K_M shown; any of the provided quants works)
llm = Llama(model_path="qwen3-0.6b-medical-q4_k_m.gguf")

# Generate up to 200 tokens for the prompt
output = llm("What are treatment options for Type 2 Diabetes?", max_tokens=200)
print(output["choices"][0]["text"])
```
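Calling the model this way returns an OpenAI-style completion dict rather than a plain string. A minimal helper to pull out the generated text is sketched below; the sample dict is illustrative only, not real model output:

```python
def extract_text(completion: dict) -> str:
    """Return the generated text from an OpenAI-style completion dict."""
    choices = completion.get("choices", [])
    if not choices:
        raise ValueError("completion has no choices")
    return choices[0].get("text", "")

# Illustrative response shaped like llama-cpp-python's completion output
sample = {
    "id": "cmpl-xyz",
    "object": "text_completion",
    "choices": [
        {"text": "Metformin is a common first-line therapy...",
         "index": 0, "logprobs": None, "finish_reason": "length"}
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 200, "total_tokens": 209},
}

print(extract_text(sample))
```

The `usage` field is also worth inspecting on edge devices, since `max_tokens` directly bounds completion cost.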

πŸ— Training Info (Base Fine-Tuning)

  • Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
  • Epochs: 3
  • Batch Size: 8
  • Learning Rate: 2e-5
  • Framework: PyTorch + Transformers

🧠 Citation

If you use this model, please cite:

```bibtex
@misc{qwen3medicalgguf2025,
  title={Qwen3-0.6B-MedicalDataset-GGUF: A Quantized Medical AI Model},
  author={XformAI-India},
  year={2025},
  url={https://huggingface.co/XformAI-india/Qwen3-0.6B-medicaldataset-gguf}
}
```
