vi-modernbert-ViHSD-ep20

This model is a Vietnamese ModernBERT model fine-tuned for 20 epochs on the ViHSD (Vietnamese Hate Speech Detection) dataset, as the model name suggests. It achieves the following results on the evaluation set:

  • Loss: 1.6519
  • Micro F1: 87.5
  • Micro Precision: 87.5
  • Micro Recall: 87.5
  • Macro F1: 68.1809
  • Macro Precision: 70.5794
  • Macro Recall: 66.4832
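
The identical micro scores are expected: for single-label multiclass classification, micro-averaged F1, precision, and recall all reduce to plain accuracy, while macro averaging weights every class equally, so the roughly 19-point micro/macro gap points to class imbalance in the evaluation set. A minimal sketch of the two averages (toy labels for illustration, not ViHSD data):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Micro- and macro-averaged F1 for single-label multiclass data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # Micro: pool TP/FP/FN over all classes. For single-label tasks this
    # equals accuracy, which is why micro F1 == precision == recall above.
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * TP / (2 * TP + FP + FN) if TP else 0.0
    # Macro: per-class F1 averaged with equal weight, so rare classes
    # pull the score down under class imbalance.
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(per_class)
    return micro, macro

# Imbalanced toy data: class 0 dominates, class 2 is always missed.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 8 + [1, 0]
micro, macro = f1_scores(y_true, y_pred)
print(micro, macro)  # micro stays high, macro drops with the missed class
```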

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 20.0
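
As a sketch of how these settings fit together (the 375 steps per epoch come from the training results table; the schedule is an approximation of the `cosine` scheduler's shape, not the exact Transformers implementation):

```python
import math

# Settings from the hyperparameter list above.
LR = 2e-4
EPOCHS = 20
STEPS_PER_EPOCH = 375                    # matches the training log
TOTAL_STEPS = EPOCHS * STEPS_PER_EPOCH   # 7500 steps overall
WARMUP_STEPS = int(0.01 * TOTAL_STEPS)   # warmup_ratio 0.01 -> 75 steps

# Effective batch size: per-device batch x gradient accumulation steps.
effective_batch = 16 * 4                 # = total_train_batch_size 64

def cosine_lr(step):
    """Linear warmup, then cosine decay to zero (an approximation of
    the `cosine` scheduler used here)."""
    if step < WARMUP_STEPS:
        return LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch)                   # 64
print(cosine_lr(WARMUP_STEPS))           # peak LR of 2e-4 after warmup
print(round(cosine_lr(TOTAL_STEPS), 10)) # decays to ~0 by the last step
```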

Training results

| Training Loss | Epoch | Step | Validation Loss | Micro F1 | Micro Precision | Micro Recall | Macro F1 | Macro Precision | Macro Recall |
|---|---|---|---|---|---|---|---|---|---|
| 1.2008 | 0.9980 | 375 | 0.3767 | 86.7141 | 86.7141 | 86.7141 | 59.5450 | 76.5138 | 54.2867 |
| 1.194 | 1.9980 | 750 | 0.3603 | 87.3503 | 87.3503 | 87.3503 | 68.2151 | 70.6976 | 66.1582 |
| 0.5185 | 2.9980 | 1125 | 0.5565 | 86.0030 | 86.0030 | 86.0030 | 67.9724 | 66.8065 | 69.3709 |
| 0.3474 | 3.9980 | 1500 | 0.6241 | 85.5539 | 85.5539 | 85.5539 | 65.7528 | 66.9509 | 66.2314 |
| 0.2354 | 4.9980 | 1875 | 0.6098 | 86.7889 | 86.7889 | 86.7889 | 64.2029 | 67.9233 | 61.4523 |
| 0.2857 | 5.9980 | 2250 | 0.8669 | 86.7889 | 86.7889 | 86.7889 | 67.2326 | 68.1950 | 67.1750 |
| 0.1326 | 6.9980 | 2625 | 0.7455 | 87.1257 | 87.1257 | 87.1257 | 68.0176 | 69.7401 | 67.3727 |
| 0.4085 | 7.9980 | 3000 | 0.9578 | 88.0240 | 88.0240 | 88.0240 | 67.6255 | 73.9917 | 64.5661 |
| 0.0273 | 8.9980 | 3375 | 1.5414 | 87.1257 | 87.1257 | 87.1257 | 64.8080 | 70.6655 | 61.2109 |
| 0.036 | 9.9980 | 3750 | 1.1192 | 87.5374 | 87.5374 | 87.5374 | 67.8021 | 71.5825 | 65.9373 |
| 0.0748 | 10.9980 | 4125 | 1.2999 | 87.3503 | 87.3503 | 87.3503 | 67.6266 | 70.4011 | 66.5444 |
| 0.0644 | 11.9980 | 4500 | 1.4459 | 87.5374 | 87.5374 | 87.5374 | 67.6215 | 71.4430 | 65.1796 |
| 0.0201 | 12.9980 | 4875 | 1.5466 | 87.6497 | 87.6497 | 87.6497 | 67.8941 | 71.5368 | 65.5838 |
| 0.01 | 13.9980 | 5250 | 1.5540 | 87.3877 | 87.3877 | 87.3877 | 68.4412 | 70.3740 | 67.0800 |
| 0.0439 | 14.9980 | 5625 | 1.5876 | 87.5749 | 87.5749 | 87.5749 | 68.6453 | 70.7319 | 67.0817 |
| 0.0628 | 15.9980 | 6000 | 1.6211 | 87.4626 | 87.4626 | 87.4626 | 67.8192 | 70.7271 | 65.8999 |
| 0.0 | 16.9980 | 6375 | 1.6364 | 87.5749 | 87.5749 | 87.5749 | 68.5724 | 70.7215 | 66.9734 |
| 0.0431 | 17.9980 | 6750 | 1.6461 | 87.5 | 87.5 | 87.5 | 68.1004 | 70.4974 | 66.3750 |
| 0.0094 | 18.9980 | 7125 | 1.6505 | 87.4626 | 87.4626 | 87.4626 | 68.0533 | 70.4170 | 66.3597 |
| 0.0174 | 19.9980 | 7500 | 1.6519 | 87.5 | 87.5 | 87.5 | 68.1809 | 70.5794 | 66.4832 |
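
Validation loss bottoms out at epoch 2 and climbs steadily afterwards while macro F1 plateaus around 68, a typical overfitting pattern; if per-epoch checkpoints were kept, selecting one by validation metric rather than taking the final epoch could be worthwhile. A quick sketch over the (rounded) epochs from the table:

```python
# (epoch, validation loss, macro F1), taken from the table above.
results = [
    (1, 0.3767, 59.5450), (2, 0.3603, 68.2151), (3, 0.5565, 67.9724),
    (4, 0.6241, 65.7528), (5, 0.6098, 64.2029), (6, 0.8669, 67.2326),
    (7, 0.7455, 68.0176), (8, 0.9578, 67.6255), (9, 1.5414, 64.8080),
    (10, 1.1192, 67.8021), (11, 1.2999, 67.6266), (12, 1.4459, 67.6215),
    (13, 1.5466, 67.8941), (14, 1.5540, 68.4412), (15, 1.5876, 68.6453),
    (16, 1.6211, 67.8192), (17, 1.6364, 68.5724), (18, 1.6461, 68.1004),
    (19, 1.6505, 68.0533), (20, 1.6519, 68.1809),
]

best_loss = min(results, key=lambda r: r[1])   # lowest validation loss
best_macro = max(results, key=lambda r: r[2])  # highest macro F1
print(best_loss[0], best_macro[0])  # epoch 2 by loss vs epoch 15 by macro F1
```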

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.6.0+cu124
  • Datasets 2.15.0
  • Tokenizers 0.21.1
Model size: 150M params (Safetensors, BF16 tensors)