jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_avg_no_sys_msg_filtered Text Generation • 3B • Updated Feb 13 • 11
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_data-mix_no_sys_msg_filtered Text Generation • 3B • Updated Feb 13 • 13
jiminmun/llama-3.2-3b_reward_model_data_mix_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 12 • 7
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_avg_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 10 • 20
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_data-mix_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 10 • 20
jiminmun/llama-3.2-3b_ppo_lr5e-07_rm_data-mix_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 10 • 13
jiminmun/llama-3.2-3b_reward_model_clarity_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 9 • 8
jiminmun/llama-3.2-3b_reward_model_focus_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 9 • 8
jiminmun/llama-3.2-3b_reward_model_relevance_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 9 • 10
jiminmun/llama-3.2-3b_reward_model_avoidbias_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 9 • 8
jiminmun/llama-3.2-3b_reward_model_answerability_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 9 • 7
jiminmun/llama-3.2-3b_reward_model_accuracy_lr9e-6_no_sys_msg_filtered Text Classification • 3B • Updated Feb 9 • 10
jiminmun/llama-3.2-3b_reward_model_clarity_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 8 • 29
jiminmun/llama-3.2-3b_reward_model_data-mix_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 8
jiminmun/llama-3.2-3b_reward_model_accuracy_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 64
jiminmun/llama-3.2-3b_reward_model_answerability_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 8
jiminmun/llama-3.2-3b_reward_model_avoidbias_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 7
jiminmun/llama-3.2-3b_reward_model_focus_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 8
jiminmun/llama-3.2-3b_reward_model_relevance_lr9e-6_no_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 24
jiminmun/llama-3.2-3b_reward_model_focus_lr9e-6_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 9
jiminmun/llama-3.2-3b_reward_model_answerability_lr9e-6_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 9
jiminmun/llama-3.2-3b_reward_model_avoidbias_lr9e-6_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 9
jiminmun/llama-3.2-3b_reward_model_accuracy_lr9e-6_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 9
jiminmun/llama-3.2-3b_reward_model_relevance_lr9e-6_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 8
jiminmun/llama-3.2-3b_reward_model_clarity_lr9e-6_w_sys_msg_unfiltered Text Classification • 3B • Updated Feb 7 • 9