JayHyeon/Qwen_0.5-rDPO_3e-6_1.0vpo_constant-1ep_0.3flip
