AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_294 Reinforcement Learning • 2B • Updated Oct 10 • 17
AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_588 Reinforcement Learning • 2B • Updated Oct 5 • 7
AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1 Reinforcement Learning • 2B • Updated Oct 3 • 1
AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_882 Reinforcement Learning • 2B • Updated Oct 7 • 297
AzalKhan/Qwen2.5-1.5B_open-r1-DAPO-Math-17k-Processed_294 Reinforcement Learning • 2B • Updated Oct 8 • 9
AzalKhan/Qwen2.5-1.5B_open-r1-DAPO-Math-17k-Processed_588 Reinforcement Learning • 2B • Updated Oct 8 • 7
AzalKhan/Qwen2.5-1.5B_open-r1-DAPO-Math-17k-Processed_882 Reinforcement Learning • 2B • Updated Oct 8 • 12
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_2048_to_16384_nokl Text Generation • 8B • Updated Oct 12 • 16
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_4096_to_16384_nokl Text Generation • 8B • Updated Oct 14 • 6
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_8192_to_16384_nokl Text Generation • 8B • Updated Oct 15 • 12
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_294_FlashRL_G4-L1024 Reinforcement Learning • 2B • Updated 28 days ago • 13
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_588_FlashRL_G4-L1024 Reinforcement Learning • 2B • Updated 28 days ago • 22
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_882_FlashRL_G4-L1024 Reinforcement Learning • 2B • Updated 28 days ago • 25
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_1176_FlashRL_G4-L1024 Reinforcement Learning • 2B • Updated 28 days ago • 163
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_294_FlashRL_G4-L2048_new Reinforcement Learning • 2B • Updated 27 days ago • 502
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_588_FlashRL_G4-L2048_new Reinforcement Learning • 2B • Updated 27 days ago • 361
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_882_FlashRL_G4-L2048_new Reinforcement Learning • 2B • Updated 27 days ago • 359
AzalKhan/Qwen2.5-1.5B-Instruct_BF16_open-r1-DAPO-Math-17k-Processed_1176_FlashRL_G4-L2048_new Reinforcement Learning • 2B • Updated 26 days ago • 507