v2_articles_single_large

This model is a fine-tuned version of xlm-roberta-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6526
  • Accuracy: 0.3857
  • F1: 0.4087
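
The card does not state the downstream task, so the snippet below is only a minimal usage sketch, assuming the checkpoint carries a sequence-classification head (which the accuracy/F1 metrics above suggest); the input text is a placeholder.

```python
# Minimal usage sketch: assumes a sequence-classification head; the example text is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "MercuraTech/v2_articles_single_large"  # repo id from this model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Example article text", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label.get(predicted_id, predicted_id))
```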

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch of these settings follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 160
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 35
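
The sketch below shows one way these settings map onto Hugging Face `TrainingArguments`; only the listed hyperparameters come from this card, while the output directory and the evaluation cadence (every 500 steps, matching the table below) are assumptions.

```python
# Sketch only: mirrors the hyperparameters listed above.
# output_dir and the eval cadence are assumptions, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="v2_articles_single_large",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    seed=42,
    gradient_accumulation_steps=2,          # effective train batch size 160
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=35,
    eval_strategy="steps",                  # assumed: table reports eval every 500 steps
    eval_steps=500,
)
```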

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy | F1     |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|:------:|
| 9.8049        | 0.2548  | 500   | 9.7696          | 0.0063   | 0.0004 |
| 9.5184        | 0.5097  | 1000  | 9.4548          | 0.0066   | 0.0002 |
| 9.0248        | 0.7645  | 1500  | 8.9444          | 0.0129   | 0.0025 |
| 8.5347        | 1.0194  | 2000  | 8.4306          | 0.0376   | 0.0116 |
| 8.0234        | 1.2742  | 2500  | 7.9427          | 0.0627   | 0.0229 |
| 7.639         | 1.5291  | 3000  | 7.4403          | 0.1047   | 0.0508 |
| 7.1271        | 1.7839  | 3500  | 6.9189          | 0.1357   | 0.0742 |
| 6.5748        | 2.0387  | 4000  | 6.3963          | 0.1605   | 0.0913 |
| 6.0621        | 2.2936  | 4500  | 5.8880          | 0.1784   | 0.1095 |
| 5.619         | 2.5484  | 5000  | 5.4470          | 0.1974   | 0.1264 |
| 5.2332        | 2.8033  | 5500  | 5.0557          | 0.2173   | 0.1512 |
| 4.7992        | 3.0581  | 6000  | 4.7030          | 0.2367   | 0.1737 |
| 4.5462        | 3.3129  | 6500  | 4.3994          | 0.2553   | 0.1979 |
| 4.2021        | 3.5678  | 7000  | 4.1254          | 0.2764   | 0.2226 |
| 3.9076        | 3.8226  | 7500  | 3.9074          | 0.2927   | 0.2426 |
| 3.7324        | 4.0775  | 8000  | 3.7108          | 0.3038   | 0.2575 |
| 3.4882        | 4.3323  | 8500  | 3.5696          | 0.3128   | 0.2731 |
| 3.3832        | 4.5872  | 9000  | 3.4306          | 0.3258   | 0.2932 |
| 3.2845        | 4.8420  | 9500  | 3.3197          | 0.3325   | 0.3035 |
| 3.035         | 5.0968  | 10000 | 3.2309          | 0.3369   | 0.3098 |
| 2.9903        | 5.3517  | 10500 | 3.1371          | 0.3440   | 0.3290 |
| 2.8294        | 5.6065  | 11000 | 3.0603          | 0.3517   | 0.3358 |
| 2.8602        | 5.8614  | 11500 | 2.9908          | 0.3558   | 0.3439 |
| 2.6384        | 6.1162  | 12000 | 2.9477          | 0.3607   | 0.3529 |
| 2.6094        | 6.3710  | 12500 | 2.8816          | 0.3653   | 0.3639 |
| 2.5143        | 6.6259  | 13000 | 2.8460          | 0.3718   | 0.3712 |
| 2.551         | 6.8807  | 13500 | 2.8101          | 0.3685   | 0.3733 |
| 2.2979        | 7.1356  | 14000 | 2.7735          | 0.3740   | 0.3804 |
| 2.3091        | 7.3904  | 14500 | 2.7315          | 0.3786   | 0.3892 |
| 2.239         | 7.6453  | 15000 | 2.6950          | 0.3812   | 0.3963 |
| 2.2109        | 7.9001  | 15500 | 2.6699          | 0.3818   | 0.4008 |
| 2.0498        | 8.1549  | 16000 | 2.6526          | 0.3857   | 0.4087 |
| 2.0797        | 8.4098  | 16500 | 2.6227          | 0.3902   | 0.4109 |
| 2.1027        | 8.6646  | 17000 | 2.5972          | 0.3873   | 0.4138 |
| 2.0108        | 8.9195  | 17500 | 2.5755          | 0.3934   | 0.4209 |
| 1.8812        | 9.1743  | 18000 | 2.5651          | 0.3935   | 0.4254 |
| 1.8961        | 9.4292  | 18500 | 2.5421          | 0.3998   | 0.4298 |
| 1.878         | 9.6840  | 19000 | 2.5359          | 0.4018   | 0.4352 |
| 1.8077        | 9.9388  | 19500 | 2.5115          | 0.4003   | 0.4362 |
| 1.7137        | 10.1937 | 20000 | 2.5032          | 0.3987   | 0.4385 |
| 1.71          | 10.4485 | 20500 | 2.4862          | 0.3995   | 0.4433 |
| 1.6946        | 10.7034 | 21000 | 2.4861          | 0.4002   | 0.4449 |
| 1.6815        | 10.9582 | 21500 | 2.4621          | 0.4073   | 0.4506 |
| 1.5642        | 11.2130 | 22000 | 2.4694          | 0.4061   | 0.4497 |
| 1.5588        | 11.4679 | 22500 | 2.4468          | 0.4085   | 0.4562 |
| 1.5367        | 11.7227 | 23000 | 2.4279          | 0.4110   | 0.4606 |
| 1.5718        | 11.9776 | 23500 | 2.4248          | 0.4106   | 0.4611 |
| 1.4507        | 12.2324 | 24000 | 2.4332          | 0.4124   | 0.4631 |
| 1.4353        | 12.4873 | 24500 | 2.4275          | 0.4121   | 0.4629 |
| 1.4319        | 12.7421 | 25000 | 2.4112          | 0.4156   | 0.4667 |
| 1.4224        | 12.9969 | 25500 | 2.4023          | 0.4132   | 0.4669 |
| 1.334         | 13.2518 | 26000 | 2.4074          | 0.4167   | 0.4729 |
| 1.32          | 13.5066 | 26500 | 2.4021          | 0.4149   | 0.4692 |
| 1.3201        | 13.7615 | 27000 | 2.3925          | 0.4172   | 0.4724 |
| 1.2608        | 14.0163 | 27500 | 2.3923          | 0.4230   | 0.4781 |
| 1.2215        | 14.2712 | 28000 | 2.4127          | 0.4146   | 0.4729 |
| 1.2394        | 14.5260 | 28500 | 2.3934          | 0.4227   | 0.4798 |
| 1.2167        | 14.7808 | 29000 | 2.3933          | 0.4216   | 0.4788 |
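
The card does not say how accuracy and F1 were aggregated. The sketch below shows one common way a `compute_metrics` callback producing these columns is written with the `evaluate` library; the weighted F1 averaging is an assumption, not taken from the card.

```python
# Sketch of a compute_metrics callback consistent with the columns above;
# the "weighted" F1 averaging is an assumption, the card does not specify it.
import numpy as np
import evaluate

accuracy_metric = evaluate.load("accuracy")
f1_metric = evaluate.load("f1")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    acc = accuracy_metric.compute(predictions=predictions, references=labels)["accuracy"]
    f1 = f1_metric.compute(predictions=predictions, references=labels, average="weighted")["f1"]
    return {"accuracy": acc, "f1": f1}
```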

Framework versions

  • Transformers 4.51.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1