mistral-7b-magyar-portas-final

This model is a fine-tuned PEFT adapter for mistralai/Mistral-7B-Instruct-v0.2; the training dataset is not documented in this card. It achieves the following results on the evaluation set (a loading sketch follows the results):

  • Loss: 0.0695
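
The snippet below is a minimal loading-and-generation sketch, assuming the adapter is published as ikerion/mistral-7b-magyar-portas-final (the repository this card belongs to) on top of mistralai/Mistral-7B-Instruct-v0.2; the prompt is only a placeholder, since the intended use case is not documented.

```python
# Minimal sketch: load the base model, attach this PEFT adapter, and generate.
# Assumes the adapter id below (this repo) and a CUDA device; adjust
# device_map / dtype for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "ikerion/mistral-7b-magyar-portas-final"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Placeholder prompt; the adapter's intended task is not described in this card.
messages = [{"role": "user", "content": "Szia! Miben segíthetek?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```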

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: 8-bit AdamW (adamw_8bit) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 3
  • mixed_precision_training: Native AMP
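
For reference, these hyperparameters map roughly onto a transformers TrainingArguments configuration like the sketch below. The output directory is a placeholder, and the LoRA configuration and dataset are not reported in this card, so they are omitted.

```python
# Sketch of a TrainingArguments setup matching the hyperparameters listed above.
# output_dir is a placeholder; LoRA settings and the dataset are not documented.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-magyar-portas-final",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective train batch size: 1 * 4 = 4
    num_train_epochs=3,
    optim="adamw_8bit",              # 8-bit AdamW via bitsandbytes
    lr_scheduler_type="linear",
    warmup_steps=10,
    seed=42,
    fp16=True,                       # native AMP mixed precision (assumed fp16)
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```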

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.167         | 0.1025 | 50   | 0.2579          |
| 0.1552        | 0.2049 | 100  | 0.1970          |
| 0.1636        | 0.3074 | 150  | 0.1660          |
| 0.1066        | 0.4098 | 200  | 0.1523          |
| 0.1183        | 0.5123 | 250  | 0.1429          |
| 0.0731        | 0.6148 | 300  | 0.1291          |
| 0.1209        | 0.7172 | 350  | 0.1183          |
| 0.1124        | 0.8197 | 400  | 0.1152          |
| 0.0654        | 0.9221 | 450  | 0.1132          |
| 0.0854        | 1.0246 | 500  | 0.1094          |
| 0.0812        | 1.1270 | 550  | 0.1041          |
| 0.0918        | 1.2295 | 600  | 0.1018          |
| 0.0744        | 1.3320 | 650  | 0.0996          |
| 0.064         | 1.4344 | 700  | 0.0952          |
| 0.076         | 1.5369 | 750  | 0.0950          |
| 0.0503        | 1.6393 | 800  | 0.0911          |
| 0.0624        | 1.7418 | 850  | 0.0868          |
| 0.0588        | 1.8443 | 900  | 0.0834          |
| 0.0558        | 1.9467 | 950  | 0.0826          |
| 0.0363        | 2.0492 | 1000 | 0.0812          |
| 0.0444        | 2.1516 | 1050 | 0.0800          |
| 0.0395        | 2.2541 | 1100 | 0.0790          |
| 0.0449        | 2.3566 | 1150 | 0.0771          |
| 0.042         | 2.4590 | 1200 | 0.0759          |
| 0.0611        | 2.5615 | 1250 | 0.0744          |
| 0.0431        | 2.6639 | 1300 | 0.0725          |
| 0.0411        | 2.7664 | 1350 | 0.0711          |
| 0.0411        | 2.8689 | 1400 | 0.0700          |
| 0.0353        | 2.9713 | 1450 | 0.0695          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.52.2
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1
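
To match the training environment, a quick check of installed versions against the list above might look like this (the package names are assumed to be the standard PyPI distributions):

```python
# Compare installed package versions with those reported in this card.
import datasets, peft, tokenizers, torch, transformers

expected = {
    "peft": "0.15.2",
    "transformers": "4.52.2",
    "torch": "2.6.0+cu124",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}
for module in (peft, transformers, torch, datasets, tokenizers):
    name = module.__name__
    print(f"{name}: installed {module.__version__}, expected {expected[name]}")
```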