🧠 AlphaMed

This is the official model checkpoint for the paper:
AlphaMed: Incentivizing Medical Reasoning with Minimalist Rule-Based RL
AlphaMed is a medical large language model trained without supervised fine-tuning on chain-of-thought (CoT) data; it relies solely on reinforcement learning to elicit step-by-step reasoning on complex medical tasks.

🚀 Usage

To use the model, format your input prompt as:

Question: [your medical question here]
Please reason step by step, and put the final answer in \boxed{}
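For programmatic use, the same template can be applied with a small helper. The sketch below is illustrative; the `format_prompt` name is ours, not part of the model card.

def format_prompt(question: str) -> str:
    # Wrap a raw medical question in the prompt template expected by AlphaMed.
    return (
        f"Question: {question}\n"
        "Please reason step by step, and put the final answer in \\boxed{}"
    )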

🔬 Example

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load model and tokenizer
model_id = "che111/AlphaMed-7B-instruct-rl"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format question
prompt = (
    "Question: A 45-year-old patient presents with chest pain radiating to the left arm and elevated troponin levels. "
    "What is the most likely diagnosis?\n"
    "Please reason step by step, and put the final answer in \\boxed{}"
)

# Generate output (greedy decoding; allow a long reasoning trace)
max_new_tokens = 8192
output = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=False)[0]["generated_text"]
print(output)
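Because the model is trained to place its final answer in \boxed{}, the answer can be pulled out of the generated text with a regular expression. This is a minimal sketch, assuming no nested braces inside \boxed{}; the `extract_boxed_answer` helper is ours:

import re

def extract_boxed_answer(text: str) -> str | None:
    # Return the contents of the last \boxed{...} span in the text, if any.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None

print(extract_boxed_answer(output))  # e.g. the final diagnosis string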