ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-HumanLLMs-Human-Like-DPO-Dataset-first-2-5e-06 Updated May 14
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-HumanLLMs-Human-Like-DPO-Dataset-second-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-HumanLLMs-Human-Like-DPO-Dataset-third-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-trl-lib-ultrafeedback_binarized-first-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-trl-lib-ultrafeedback_binarized-second-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-trl-lib-ultrafeedback_binarized-third-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-abacusai-MetaMath_DPO_FewShot-third-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-abacusai-MetaMath_DPO_FewShot-second-2-5e-06 Updated May 15
ASethi04/meta-llama-Llama-3.1-8B-Instruct-dpo-abacusai-MetaMath_DPO_FewShot-first-2-5e-06 Updated May 15