hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_warmed_4x4 Text Generation • 2B • Updated 10 days ago • 57
hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_warmed_4x4_plus Text Generation • 2B • Updated 6 days ago • 74