mistral-7b-magyar-portas-final

This model is a fine-tuned PEFT adapter for mistralai/Mistral-7B-Instruct-v0.2; the training dataset is not documented in this card. It achieves the following results on the evaluation set (a loading sketch follows the results):

  • Loss: 0.0695
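
The snippet below is a minimal loading-and-generation sketch, assuming the adapter is published as ikerion/mistral-7b-magyar-portas-final (the repository this card belongs to) on top of mistralai/Mistral-7B-Instruct-v0.2; the prompt is only a placeholder, since the intended use case is not documented.

```python
# Minimal sketch: load the base model, attach this PEFT adapter, and generate.
# Assumes the adapter id below (this repo) and a CUDA device; adjust
# device_map / dtype for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "ikerion/mistral-7b-magyar-portas-final"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Placeholder prompt; the adapter's intended task is not described in this card.
messages = [{"role": "user", "content": "Szia! Miben segíthetek?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```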

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: 8-bit AdamW (adamw_8bit) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 3
  • mixed_precision_training: Native AMP
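
For reference, these hyperparameters map roughly onto a transformers TrainingArguments configuration like the sketch below. The output directory is a placeholder, and the LoRA configuration and dataset are not reported in this card, so they are omitted.

```python
# Sketch of a TrainingArguments setup matching the hyperparameters listed above.
# output_dir is a placeholder; LoRA settings and the dataset are not documented.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-magyar-portas-final",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective train batch size: 1 * 4 = 4
    num_train_epochs=3,
    optim="adamw_8bit",              # 8-bit AdamW via bitsandbytes
    lr_scheduler_type="linear",
    warmup_steps=10,
    seed=42,
    fp16=True,                       # native AMP mixed precision (assumed fp16)
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```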

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.167         | 0.1025 | 50   | 0.2579          |
| 0.1552        | 0.2049 | 100  | 0.1970          |
| 0.1636        | 0.3074 | 150  | 0.1660          |
| 0.1066        | 0.4098 | 200  | 0.1523          |
| 0.1183        | 0.5123 | 250  | 0.1429          |
| 0.0731        | 0.6148 | 300  | 0.1291          |
| 0.1209        | 0.7172 | 350  | 0.1183          |
| 0.1124        | 0.8197 | 400  | 0.1152          |
| 0.0654        | 0.9221 | 450  | 0.1132          |
| 0.0854        | 1.0246 | 500  | 0.1094          |
| 0.0812        | 1.1270 | 550  | 0.1041          |
| 0.0918        | 1.2295 | 600  | 0.1018          |
| 0.0744        | 1.3320 | 650  | 0.0996          |
| 0.064         | 1.4344 | 700  | 0.0952          |
| 0.076         | 1.5369 | 750  | 0.0950          |
| 0.0503        | 1.6393 | 800  | 0.0911          |
| 0.0624        | 1.7418 | 850  | 0.0868          |
| 0.0588        | 1.8443 | 900  | 0.0834          |
| 0.0558        | 1.9467 | 950  | 0.0826          |
| 0.0363        | 2.0492 | 1000 | 0.0812          |
| 0.0444        | 2.1516 | 1050 | 0.0800          |
| 0.0395        | 2.2541 | 1100 | 0.0790          |
| 0.0449        | 2.3566 | 1150 | 0.0771          |
| 0.042         | 2.4590 | 1200 | 0.0759          |
| 0.0611        | 2.5615 | 1250 | 0.0744          |
| 0.0431        | 2.6639 | 1300 | 0.0725          |
| 0.0411        | 2.7664 | 1350 | 0.0711          |
| 0.0411        | 2.8689 | 1400 | 0.0700          |
| 0.0353        | 2.9713 | 1450 | 0.0695          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.52.2
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1
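
To match the training environment, a quick check of installed versions against the list above might look like this (the package names are assumed to be the standard PyPI distributions):

```python
# Compare installed package versions with those reported in this card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    "peft": "0.15.2",
    "transformers": "4.52.2",
    "torch": "2.6.0+cu124",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}
for module in (peft, transformers, torch, datasets, tokenizers):
    name = module.__name__
    print(f"{name}: installed {module.__version__}, expected {expected[name]}")
```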