This model is the result of applying DPO (Direct Preference Optimization) to petkopetkov/Qwen2.5-0.5B-Instruct-med-diagnosis, using the relatively small dataset nuriyev/medical-question-answering-rl-labeled-qwen-0.5B-binarized_v2, available on Hugging Face.
It was evaluated by qualitative ranking and in some cases slightly outperforms the original petkopetkov/Qwen2.5-0.5B-Instruct-med-diagnosis.
The web interface at https://github.com/MahammadNuriyev62/doctor-llm was developed to test the model, make it publicly available, and collect user feedback for further RLHF.
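For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This is a minimal illustration of the standard DPO loss, not the training code used for this model: the function name `dpo_loss`, the `beta` value, and the log-probabilities in the usage lines are all hypothetical.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or under the frozen reference model.
    `beta` (hypothetical default) scales the implicit reward.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written in a numerically stable form
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative numbers only: the policy favours the chosen answer more
# strongly than the reference does, so the loss falls below log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline, improved)
```

When the policy and the reference agree, the margin is zero and the loss equals log(2); the loss shrinks as the policy raises the chosen response's likelihood relative to the rejected one.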
Usage
pip install -U transformers torch accelerate
Run with the pipeline API
from transformers import pipeline
import torch

system_prompt = (
    "You are a medical assistant trained to provide general health information. "
    "Follow these rules:\n"
    "1. Only answer the question asked.\n"
    "2. Do not deviate from medical facts.\n"
    "3. Be concise and accurate."
)
prompt = "What is contact dermatitis, and what are some of the typical symptoms associated with this condition, including the type of hypersensitivity reaction that causes it?"

# Chat-style input: a list of role/content messages
chat = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]

pipe = pipeline(
    task="text-generation",
    model="nuriyev/Qwen2.5-0.5B-Instruct-medical-dpo",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
    max_new_tokens=1024,
)

response = pipe(chat)
# generated_text holds the whole conversation; the model's reply is the last message
print(response[0]["generated_text"][-1]["content"])
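When the pipeline is given a list of chat messages, `generated_text` contains the full conversation, with the assistant's reply appended as the last message. The helper below is a small sketch for pulling out just that reply. The name `extract_reply` and the mock response are hypothetical, and the mock only mimics the output shape so the snippet runs without downloading the model.

```python
def extract_reply(response):
    """Return the assistant's text from a text-generation pipeline result.

    Handles both chat-style output (a list of role/content messages)
    and plain-string output.
    """
    generated = response[0]["generated_text"]
    if isinstance(generated, str):
        return generated
    # Chat-style output: the newly generated message is the last one
    return generated[-1]["content"]


# Hypothetical stand-in for `pipe(chat)`, shaped like a chat-style result
mock_response = [{
    "generated_text": [
        {"role": "system", "content": "(system prompt)"},
        {"role": "user", "content": "(user question)"},
        {"role": "assistant", "content": "(model reply)"},
    ]
}]
print(extract_reply(mock_response))
```

With the real pipeline, `extract_reply(pipe(chat))` would return only the generated answer.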
Training


