AIPlans/qwen3-8b-ipo-hh-rlhf
Text Generation
Versions of Qwen 0.6B that differ only in the post-training method used. The post-training data should be the HH-RLHF dataset.