Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16-v2 / model-00002-of-00004.safetensors

Commit History

Trained with Unsloth
64ec39a
verified

dumbequation committed on