MNLP_M3_mcqa_dpo_model

This model is a fine-tuned version of AnnaelleMyriam/MNLP_M3_sft_dpo_1024_beta0.5_2e-5_FINAL_v3_16_check1500 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3494

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction in code follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
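
For reference, here is a minimal sketch of how these values map onto `TrainingArguments` from `transformers`, assuming the standard Trainer API was used (the actual training script is not part of this card, so the `output_dir` and any omitted arguments are placeholders):

```python
from transformers import TrainingArguments

# Hedged reconstruction of the hyperparameters listed above; illustrative only.
training_args = TrainingArguments(
    output_dir="MNLP_M3_mcqa_dpo_model",  # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 2 * 4 = 8
    optim="adamw_torch",            # betas=(0.9, 0.999) and eps=1e-08 are the torch AdamW defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
)
```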

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.541         | 0.0811 | 150  | 0.4871          |
| 0.3978        | 0.1622 | 300  | 0.4650          |
| 0.4109        | 0.2433 | 450  | 0.4297          |
| 0.4848        | 0.3244 | 600  | 0.4074          |
| 0.4588        | 0.4055 | 750  | 0.3867          |
| 0.4039        | 0.4866 | 900  | 0.3828          |
| 0.3221        | 0.5677 | 1050 | 0.4007          |
| 0.3642        | 0.6488 | 1200 | 0.3854          |
| 0.3558        | 0.7299 | 1350 | 0.4022          |
| 0.3155        | 0.8110 | 1500 | 0.3775          |
| 0.4315        | 0.8921 | 1650 | 0.3692          |
| 0.3845        | 0.9732 | 1800 | 0.3586          |
| 0.4821        | 1.0541 | 1950 | 0.3639          |
| 0.3883        | 1.1352 | 2100 | 0.3683          |
| 0.3996        | 1.2163 | 2250 | 0.3670          |
| 0.4104        | 1.2974 | 2400 | 0.3365          |
| 0.4321        | 1.3785 | 2550 | 0.3496          |
| 0.3271        | 1.4596 | 2700 | 0.3394          |
| 0.3327        | 1.5407 | 2850 | 0.3544          |
| 0.2663        | 1.6218 | 3000 | 0.3632          |
| 0.5097        | 1.7029 | 3150 | 0.3435          |
| 0.4855        | 1.7840 | 3300 | 0.3344          |
| 0.1663        | 1.8651 | 3450 | 0.3521          |
| 0.3408        | 1.9462 | 3600 | 0.3551          |
| 0.2752        | 2.0270 | 3750 | 0.3448          |
| 0.4994        | 2.1081 | 3900 | 0.3552          |
| 0.4012        | 2.1892 | 4050 | 0.3537          |
| 0.1766        | 2.2703 | 4200 | 0.3596          |
| 0.3081        | 2.3514 | 4350 | 0.3584          |
| 0.2448        | 2.4325 | 4500 | 0.3595          |
| 0.3791        | 2.5137 | 4650 | 0.3547          |
| 0.3062        | 2.5948 | 4800 | 0.3501          |
| 0.2908        | 2.6759 | 4950 | 0.3472          |
| 0.3918        | 2.7570 | 5100 | 0.3470          |
| 0.3629        | 2.8381 | 5250 | 0.3479          |
| 0.2431        | 2.9192 | 5400 | 0.3487          |
| 0.1877        | 3.0    | 5550 | 0.3494          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.52.4
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
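
The PEFT entry above suggests the checkpoint is a parameter-efficient adapter on top of the base model named at the top of this card. A minimal loading sketch, assuming a causal-LM adapter and using the repository id from this card (the correct tokenizer source may instead be the adapter repository itself):

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Assumes the repository hosts a PEFT adapter whose config points at the base model.
model = AutoPeftModelForCausalLM.from_pretrained("aymanbakiri/MNLP_M3_mcqa_dpo_model")
tokenizer = AutoTokenizer.from_pretrained(
    "AnnaelleMyriam/MNLP_M3_sft_dpo_1024_beta0.5_2e-5_FINAL_v3_16_check1500"
)

# Hypothetical MCQA-style prompt; the exact expected format is not documented here.
prompt = (
    "Question: Which planet is known as the Red Planet?\n"
    "A. Venus\nB. Mars\nC. Jupiter\nD. Saturn\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```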