dumbequation/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16 Text Generation • Updated Mar 4 • 2 • 1
dumbequation/Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-v2 Text Generation • Updated Mar 4 • 5 • 1
Medical LLMs Collection My experiments to push AI in Medicine, not to replace doctors but to empower them • 4 items • Updated Mar 13
Reasoning Work Collection Models I've trained to think like DeepSeek R1 using online learning with Group Relative Policy Optimization (GRPO), introduced by DeepSeekMath • 6 items • Updated Mar 13