JayHyeon/Qwen_1.5B-math-rDPO_5e-7_0.1lsmooth-1.0vpo_constant-1ep