VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-instruct-grpo-69k-sys12-mtrl-d1fo-535-step 2B • Updated 29 days ago • 9
VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-instruct-grpo-69k-sys12-mtrl-d1fo-280-step 2B • Updated 29 days ago • 12
VerlTool/torl-deep_math-fsdp_agent-qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6-310-step 8B • Updated 30 days ago • 36
VerlTool/torl-deep_math-fsdp_agent-qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6-320-step 2B • Updated May 27 • 32
VerlTool/torl-deep_math-fsdp-qwen2.5-math-1.5b-grpo-n16-b128-t1.0-lr1e-6-830-step 2B • Updated May 25 • 11
VerlTool/acecoder-fsdp_agent-mimo-7b-base-grpo-n16-b128-t1.0-lr1e-6-69k-mtrl-sys9-new2-debug-120-step 8B • Updated May 16 • 9
VerlTool/acecoder-fsdp_agent-xiaomimimo_mimo-7b-base-grpo-n16-b128-t1.0-lr1e-6-69k-2turn-sys4-120-step 8B • Updated May 16 • 7
VerlTool/torl-fsdp_agent-qwen_qwen2.5-math-7b-grpo-n16-b128-t1.0-lr1e-6-mtrl-v6-330-step 8B • Updated May 15 • 9
VerlTool/acecoder-fsdp-xiaomimimo_mimo-7b-base-grpo-n16-b128-t1.0-lr1e-6-69k-sys3-no-tool-110-step Updated May 15
VerlTool/acecoder-fsdp_agent-qwen_qwen2.5-coder-1.5b-grpo-n16-b128-t1.0-lr1e-6-69k-mtrl-sys8-110-step 2B • Updated May 12 • 10