hafidhsoekma/dr_grpo-qwen-3_value_function-deepmath-1.7B • Text Generation • 2B • Updated 5 days ago • 2
hafidhsoekma/grpo-qwen-3_value_function-deepscaler-1.7B • Text Generation • 2B • Updated 8 days ago • 11
hafidhsoekma/grpo-gemini-8_value_function-deep_math-0.6B • Text Generation • 0.6B • Updated 8 days ago • 29
hafidhsoekma/Qwen2.5-VL-7B-Instruct-MedTrinity-25M-demo-shuffle-45000-indonesian • Image-to-Text • Updated May 29