---
license: mit
---

# 🧠 AlphaMed

This is the official model checkpoint for the paper:

**[AlphaMed: Incentivizing Medical Reasoning with minimalist Rule-Based RL](https://www.arxiv.org/abs/2505.17952)**

AlphaMed is a medical large language model trained **without supervised fine-tuning on chain-of-thought (CoT) data**, relying solely on reinforcement learning to elicit step-by-step reasoning on complex medical tasks.

## 🚀 Usage

To use the model, format your input prompt as:

> **Question:** [your medical question here]
> **Please reason step by step, and put the final answer in \boxed{}**

### 🔬 Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load model and tokenizer
model_id = "che111/AlphaMed-3B-instruct-rl"  # Replace with actual repo path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format the question
prompt = (
    "Question: A 45-year-old patient presents with chest pain radiating to the left arm and elevated troponin levels. "
    "What is the most likely diagnosis?\n"
    "Please reason step by step, and put the final answer in \\boxed{}"
)

# Generate output
max_new_tokens = 8196
output = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=False)[0]["generated_text"]
print(output)
```
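Because the prompt template instructs the model to place its final answer in `\boxed{}`, the answer can be recovered from the generated text for programmatic evaluation. Below is a minimal sketch of such a parser; the helper name and regex are illustrative, not part of the released code:

```python
import re

def extract_boxed_answer(text):
    """Return the contents of the last \\boxed{...} in a completion.

    Handles one level of nested braces; returns None if no box is found.
    """
    matches = re.findall(r"\\boxed\{((?:[^{}]|\{[^{}]*\})*)\}", text)
    return matches[-1] if matches else None

# Example on a mock completion (not an actual model output):
sample = "Elevated troponin indicates myocardial injury. \\boxed{Acute myocardial infarction}"
print(extract_boxed_answer(sample))  # Acute myocardial infarction
```

Taking the last match makes the parser robust to intermediate `\boxed{}` expressions that may appear mid-reasoning.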