v2_articles_single_large
This model is a fine-tuned version of xlm-roberta-large on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.6526
- Accuracy: 0.3857
- F1: 0.4087
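
The card does not state the classification task or label set. Assuming a standard sequence-classification head (which the accuracy/F1 metrics suggest), a minimal inference sketch might look like this; the example text and the use of `id2label` are illustrative only:

```python
# Minimal inference sketch. Assumes a sequence-classification head;
# the task, labels, and example text below are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "MercuraTech/v2_articles_single_large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Example article text to classify."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_id = logits.argmax(dim=-1).item()
print(predicted_id, model.config.id2label.get(predicted_id))
```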
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 80
- eval_batch_size: 80
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 160
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 35
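
These settings map roughly onto the `TrainingArguments` sketch below. The evaluation/logging cadence of 500 steps is inferred from the results table, and the output directory name is an assumption; the card does not document the dataset or metric wiring.

```python
# Sketch of the TrainingArguments implied by the hyperparameters above.
# eval/logging every 500 steps is inferred from the results table;
# output_dir is an assumed name, not taken from the card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="v2_articles_single_large",
    learning_rate=2e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    gradient_accumulation_steps=2,   # effective train batch size: 80 * 2 = 160
    num_train_epochs=35,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",
    eval_steps=500,
    logging_steps=500,
)
```

A `Trainer` built from these arguments, the base FacebookAI/xlm-roberta-large checkpoint, and the (undocumented) training and evaluation datasets would correspond to the run logged below.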
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
---|---|---|---|---|---|
9.8049 | 0.2548 | 500 | 9.7696 | 0.0063 | 0.0004 |
9.5184 | 0.5097 | 1000 | 9.4548 | 0.0066 | 0.0002 |
9.0248 | 0.7645 | 1500 | 8.9444 | 0.0129 | 0.0025 |
8.5347 | 1.0194 | 2000 | 8.4306 | 0.0376 | 0.0116 |
8.0234 | 1.2742 | 2500 | 7.9427 | 0.0627 | 0.0229 |
7.639 | 1.5291 | 3000 | 7.4403 | 0.1047 | 0.0508 |
7.1271 | 1.7839 | 3500 | 6.9189 | 0.1357 | 0.0742 |
6.5748 | 2.0387 | 4000 | 6.3963 | 0.1605 | 0.0913 |
6.0621 | 2.2936 | 4500 | 5.8880 | 0.1784 | 0.1095 |
5.619 | 2.5484 | 5000 | 5.4470 | 0.1974 | 0.1264 |
5.2332 | 2.8033 | 5500 | 5.0557 | 0.2173 | 0.1512 |
4.7992 | 3.0581 | 6000 | 4.7030 | 0.2367 | 0.1737 |
4.5462 | 3.3129 | 6500 | 4.3994 | 0.2553 | 0.1979 |
4.2021 | 3.5678 | 7000 | 4.1254 | 0.2764 | 0.2226 |
3.9076 | 3.8226 | 7500 | 3.9074 | 0.2927 | 0.2426 |
3.7324 | 4.0775 | 8000 | 3.7108 | 0.3038 | 0.2575 |
3.4882 | 4.3323 | 8500 | 3.5696 | 0.3128 | 0.2731 |
3.3832 | 4.5872 | 9000 | 3.4306 | 0.3258 | 0.2932 |
3.2845 | 4.8420 | 9500 | 3.3197 | 0.3325 | 0.3035 |
3.035 | 5.0968 | 10000 | 3.2309 | 0.3369 | 0.3098 |
2.9903 | 5.3517 | 10500 | 3.1371 | 0.3440 | 0.3290 |
2.8294 | 5.6065 | 11000 | 3.0603 | 0.3517 | 0.3358 |
2.8602 | 5.8614 | 11500 | 2.9908 | 0.3558 | 0.3439 |
2.6384 | 6.1162 | 12000 | 2.9477 | 0.3607 | 0.3529 |
2.6094 | 6.3710 | 12500 | 2.8816 | 0.3653 | 0.3639 |
2.5143 | 6.6259 | 13000 | 2.8460 | 0.3718 | 0.3712 |
2.551 | 6.8807 | 13500 | 2.8101 | 0.3685 | 0.3733 |
2.2979 | 7.1356 | 14000 | 2.7735 | 0.3740 | 0.3804 |
2.3091 | 7.3904 | 14500 | 2.7315 | 0.3786 | 0.3892 |
2.239 | 7.6453 | 15000 | 2.6950 | 0.3812 | 0.3963 |
2.2109 | 7.9001 | 15500 | 2.6699 | 0.3818 | 0.4008 |
2.0498 | 8.1549 | 16000 | 2.6526 | 0.3857 | 0.4087 |
2.0797 | 8.4098 | 16500 | 2.6227 | 0.3902 | 0.4109 |
2.1027 | 8.6646 | 17000 | 2.5972 | 0.3873 | 0.4138 |
2.0108 | 8.9195 | 17500 | 2.5755 | 0.3934 | 0.4209 |
1.8812 | 9.1743 | 18000 | 2.5651 | 0.3935 | 0.4254 |
1.8961 | 9.4292 | 18500 | 2.5421 | 0.3998 | 0.4298 |
1.878 | 9.6840 | 19000 | 2.5359 | 0.4018 | 0.4352 |
1.8077 | 9.9388 | 19500 | 2.5115 | 0.4003 | 0.4362 |
1.7137 | 10.1937 | 20000 | 2.5032 | 0.3987 | 0.4385 |
1.71 | 10.4485 | 20500 | 2.4862 | 0.3995 | 0.4433 |
1.6946 | 10.7034 | 21000 | 2.4861 | 0.4002 | 0.4449 |
1.6815 | 10.9582 | 21500 | 2.4621 | 0.4073 | 0.4506 |
1.5642 | 11.2130 | 22000 | 2.4694 | 0.4061 | 0.4497 |
1.5588 | 11.4679 | 22500 | 2.4468 | 0.4085 | 0.4562 |
1.5367 | 11.7227 | 23000 | 2.4279 | 0.4110 | 0.4606 |
1.5718 | 11.9776 | 23500 | 2.4248 | 0.4106 | 0.4611 |
1.4507 | 12.2324 | 24000 | 2.4332 | 0.4124 | 0.4631 |
1.4353 | 12.4873 | 24500 | 2.4275 | 0.4121 | 0.4629 |
1.4319 | 12.7421 | 25000 | 2.4112 | 0.4156 | 0.4667 |
1.4224 | 12.9969 | 25500 | 2.4023 | 0.4132 | 0.4669 |
1.334 | 13.2518 | 26000 | 2.4074 | 0.4167 | 0.4729 |
1.32 | 13.5066 | 26500 | 2.4021 | 0.4149 | 0.4692 |
1.3201 | 13.7615 | 27000 | 2.3925 | 0.4172 | 0.4724 |
1.2608 | 14.0163 | 27500 | 2.3923 | 0.4230 | 0.4781 |
1.2215 | 14.2712 | 28000 | 2.4127 | 0.4146 | 0.4729 |
1.2394 | 14.5260 | 28500 | 2.3934 | 0.4227 | 0.4798 |
1.2167 | 14.7808 | 29000 | 2.3933 | 0.4216 | 0.4788 |
Framework versions
- Transformers 4.51.0
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for MercuraTech/v2_articles_single_large
- Base model: FacebookAI/xlm-roberta-large