shubhamprshr/Qwen2.5-3B-Instruct_math_sgrpo_cosine_0.5_0.5_True_1200 Text Generation • Updated May 8 • 8
citrinegui/Qwen2.5-3B-Instruct_countdown6_grpo_balanced_0.5_0.5_True_1600 Text Generation • Updated May 8 • 5
Grogros/dmWM-Qwen-Qwen2.5-3B-Instruct-WOHealth-Al4-NH-WO-d4-a0.2-v4 Text Generation • Updated May 8 • 7
Grogros/dmWM-Qwen-Qwen2.5-3B-Instruct-WOHealth-Al4-NH-WO-TV Text Generation • Updated about 1 month ago • 30
Grogros/dmWM-Qwen-Qwen2.5-3B-Instruct-OMI-Al4-OWT-TV Text Generation • Updated about 1 month ago • 41
Grogros/dmWM-Qwen-Qwen2.5-3B-Instruct-OWT-Al4-OMI-LucieFr-HA-WOHealth-Full Text Generation • Updated about 1 month ago • 29
sieufgsb9dv77w-94r/unuse_ego-r1_train_lr2e-5_epochs3_20250506_225733 Text Generation • Updated about 1 month ago • 4
krtanmay147/qwen2.5-3B-r1-factual-qa_steps_1000_batch_2_promptv10_3_rewards_new_data_2 Updated 29 days ago
krtanmay147/qwen2.5-3B-r1-factual-qa_steps_1000_batch_2_promptv12_3_rewards_new_data_template_1_no_quant Updated 28 days ago