DPO model for Mistral-Base, fine-tuned on trl/ultrafeedback_binarized with the noisy preference pairs excluded from training.
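
A minimal loading sketch, not taken from the model card itself. It assumes the repository `ComparisonPO/Mistral-Base-7B-DPO_clean` ships a tokenizer alongside the weights, that `torch`, `transformers`, and `accelerate` are installed, and that plain-text prompting is appropriate for a model derived from a base checkpoint. The helper name `generate` is hypothetical.

```python
MODEL_ID = "ComparisonPO/Mistral-Base-7B-DPO_clean"  # repo id as shown on this page


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the checkpoint in BF16 and return a completion for `prompt`."""
    # Imported here so the helper can be defined without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed for the weights
        device_map="auto",  # assumption: `accelerate` is available for device placement
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("...")` downloads the weights on first use, roughly 14.5 GB for 7.24B parameters at 2 bytes each, so a GPU with enough memory (or CPU offloading) is needed.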

Model size: 7.24B params (Safetensors)
Tensor type: BF16

Dataset used to train ComparisonPO/Mistral-Base-7B-DPO_clean