Model Details
Model Description
A Llama-3.1-8B model fine-tuned with the ORPO trainer.
Training Details
Training Data
The model is fine-tuned on the mlabonne/orpo-dpo-mix-40k dataset.
Training Procedure
The model was trained with the ORPO trainer. Only the first 5K rows of the dataset were used for fine-tuning (5K out of 40K).
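The subset selection described above can be sketched as follows. This is a minimal illustration, not the original training script: it assumes the Hugging Face `datasets` library and a `train` split on the dataset, and the helper names `first_rows_split` and `load_orpo_subset` are hypothetical. The ORPO training step itself is not shown.

```python
def first_rows_split(n_rows: int, split: str = "train") -> str:
    """Build a `datasets` split-slice string such as 'train[:5000]'."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return f"{split}[:{n_rows}]"


def load_orpo_subset(n_rows: int = 5000):
    """Load the first `n_rows` rows of the dataset named in this card.

    Assumption: the dataset exposes a 'train' split; the actual script
    used for this model is not published in the card.
    """
    # Imported lazily so first_rows_split works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "mlabonne/orpo-dpo-mix-40k",
        split=first_rows_split(n_rows),
    )
```

Calling `load_orpo_subset()` would download the dataset and return only its first 5,000 rows, which could then be passed to an ORPO trainer as the training set.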