CodeDPO/qwen25-ins-7b-coderm_new_margin_scalebt-7b-reinforce_plus_new_dataset Updated 24 days ago • 29
CodeDPO/qwen25-ins-7b-coderm_new_margin_scalebt-7b-reinforce-plus-episode_1 Text Generation • Updated 25 days ago • 26
CodeDPO/AceCoderV2-mini-processed_openrlhf_format_r1 Viewer • Updated about 20 hours ago • 26.4k • 44