# xlm-roberta-large-bs-16-lr-5e-05-ep-1-wp-0.1-gacc-8-gnm-1.0-FP16-mx-512-v0.1
This model is a fine-tuned version of [FacebookAI/xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3291
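Since the base model is a masked-language model and only a validation loss is reported, the checkpoint can presumably be used for fill-mask inference. Below is a minimal sketch, assuming the fine-tuned checkpoint retains the masked-language-modeling head and is published under the repo id formed from the uploader namespace and this card's title:

```python
# Fill-mask sketch. Assumptions: the checkpoint keeps the MLM head, and the
# repo id below (uploader namespace + this card's title) is where it is hosted.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

repo_id = "BounharAbdelaziz/xlm-roberta-large-bs-16-lr-5e-05-ep-1-wp-0.1-gacc-8-gnm-1.0-FP16-mx-512-v0.1"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForMaskedLM.from_pretrained(repo_id)

text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Report the top-5 predictions for the masked position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top5 = logits[0, mask_pos].topk(5).indices[0].tolist()
print(tokenizer.convert_ids_to_tokens(top5))
```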
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 128
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
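For reproduction, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch under assumptions: the dataset and `model`/`tokenizer` objects are undocumented, and FP16 plus max_grad_norm=1.0 are inferred from the `FP16` and `gnm-1.0` tags in the model name rather than stated in the card.

```python
# Sketch of a TrainingArguments setup matching the list above. The output_dir
# and the eval cadence (every 50 steps, matching the results table) are
# illustrative, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-finetuned",  # illustrative path
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=8,             # 16 * 8 = 128 effective batch
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=1,
    max_grad_norm=1.0,                         # assumed from the "gnm-1.0" tag
    fp16=True,                                 # assumed from the "FP16" tag
    eval_strategy="steps",
    eval_steps=50,
)
```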
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
17.8101 | 0.0055 | 50 | 5.0950 |
17.0019 | 0.0109 | 100 | 4.5201 |
15.6548 | 0.0164 | 150 | 4.3384 |
15.1777 | 0.0219 | 200 | 4.1739 |
15.6084 | 0.0273 | 250 | nan |
14.3951 | 0.0328 | 300 | 3.8169 |
14.184 | 0.0382 | 350 | 3.6496 |
14.157 | 0.0437 | 400 | 3.6370 |
13.78 | 0.0492 | 450 | 3.5758 |
13.7744 | 0.0546 | 500 | 3.4703 |
13.5207 | 0.0601 | 550 | 3.7431 |
12.8285 | 0.0656 | 600 | 3.4274 |
13.804 | 0.0710 | 650 | 3.2934 |
13.1016 | 0.0765 | 700 | 3.2929 |
12.9295 | 0.0819 | 750 | 3.2609 |
12.9975 | 0.0874 | 800 | 3.2154 |
12.8213 | 0.0929 | 850 | 3.1314 |
12.8268 | 0.0983 | 900 | 3.2730 |
12.9578 | 0.1038 | 950 | 3.1037 |
13.3053 | 0.1093 | 1000 | 3.1316 |
12.7619 | 0.1147 | 1050 | 3.0334 |
12.7588 | 0.1202 | 1100 | 3.0158 |
13.0244 | 0.1256 | 1150 | nan |
12.5915 | 0.1311 | 1200 | 3.0654 |
11.9712 | 0.1366 | 1250 | 2.9577 |
12.3582 | 0.1420 | 1300 | 2.9230 |
11.5631 | 0.1475 | 1350 | 2.9535 |
12.2369 | 0.1530 | 1400 | 3.0215 |
12.1179 | 0.1584 | 1450 | 2.8922 |
12.2686 | 0.1639 | 1500 | 2.8579 |
11.84 | 0.1694 | 1550 | 2.9253 |
11.6617 | 0.1748 | 1600 | 2.8686 |
11.6284 | 0.1803 | 1650 | 2.9694 |
11.8075 | 0.1857 | 1700 | 2.8212 |
11.5036 | 0.1912 | 1750 | 2.9165 |
11.5307 | 0.1967 | 1800 | 2.7684 |
11.7167 | 0.2021 | 1850 | nan |
12.3351 | 0.2076 | 1900 | 2.8803 |
11.8514 | 0.2131 | 1950 | 2.7556 |
11.9989 | 0.2185 | 2000 | 2.7531 |
12.0212 | 0.2240 | 2050 | 2.7811 |
11.5154 | 0.2294 | 2100 | 2.8039 |
11.8633 | 0.2349 | 2150 | 2.8538 |
11.5177 | 0.2404 | 2200 | 2.7256 |
11.5939 | 0.2458 | 2250 | 2.7699 |
11.6772 | 0.2513 | 2300 | 2.6950 |
11.2238 | 0.2568 | 2350 | 2.7304 |
11.1286 | 0.2622 | 2400 | 2.6525 |
11.7324 | 0.2677 | 2450 | 2.7490 |
10.8508 | 0.2731 | 2500 | 2.7722 |
10.8564 | 0.2786 | 2550 | 2.6763 |
11.4515 | 0.2841 | 2600 | 2.8086 |
11.0676 | 0.2895 | 2650 | 2.6937 |
11.21 | 0.2950 | 2700 | 2.7150 |
11.1875 | 0.3005 | 2750 | nan |
10.9272 | 0.3059 | 2800 | 2.7153 |
11.3898 | 0.3114 | 2850 | 2.7387 |
10.8959 | 0.3169 | 2900 | 2.7029 |
11.4243 | 0.3223 | 2950 | 2.6342 |
11.0173 | 0.3278 | 3000 | nan |
10.3994 | 0.3332 | 3050 | 2.4888 |
10.9072 | 0.3387 | 3100 | 2.6332 |
11.1628 | 0.3442 | 3150 | 2.6375 |
10.8527 | 0.3496 | 3200 | 2.5704 |
11.0833 | 0.3551 | 3250 | 2.6602 |
10.5689 | 0.3606 | 3300 | 2.5335 |
10.8759 | 0.3660 | 3350 | 2.5575 |
10.489 | 0.3715 | 3400 | 2.5462 |
10.7414 | 0.3769 | 3450 | 2.6745 |
10.8202 | 0.3824 | 3500 | nan |
10.7027 | 0.3879 | 3550 | 2.5444 |
11.4548 | 0.3933 | 3600 | 2.6391 |
10.4279 | 0.3988 | 3650 | 2.5813 |
10.726 | 0.4043 | 3700 | 2.5973 |
10.1897 | 0.4097 | 3750 | 2.5719 |
10.5646 | 0.4152 | 3800 | 2.6626 |
11.0231 | 0.4207 | 3850 | 2.5786 |
11.0557 | 0.4261 | 3900 | 2.6770 |
10.3466 | 0.4316 | 3950 | 2.5352 |
10.8437 | 0.4370 | 4000 | 2.7082 |
10.6587 | 0.4425 | 4050 | nan |
9.9986 | 0.4480 | 4100 | 2.5620 |
10.8423 | 0.4534 | 4150 | 2.5181 |
10.8241 | 0.4589 | 4200 | 2.5662 |
10.4568 | 0.4644 | 4250 | 2.5753 |
10.0628 | 0.4698 | 4300 | 2.5147 |
10.9293 | 0.4753 | 4350 | 2.5583 |
10.6637 | 0.4807 | 4400 | 2.4761 |
10.495 | 0.4862 | 4450 | 2.5910 |
10.2338 | 0.4917 | 4500 | 2.5183 |
10.4056 | 0.4971 | 4550 | 2.5513 |
10.373 | 0.5026 | 4600 | 2.4892 |
10.223 | 0.5081 | 4650 | nan |
10.7237 | 0.5135 | 4700 | 2.4571 |
10.473 | 0.5190 | 4750 | 2.5045 |
10.3394 | 0.5244 | 4800 | nan |
9.9574 | 0.5299 | 4850 | 2.4845 |
10.7453 | 0.5354 | 4900 | nan |
10.0733 | 0.5408 | 4950 | 2.5105 |
9.8847 | 0.5463 | 5000 | 2.5298 |
10.5273 | 0.5518 | 5050 | 2.5251 |
10.1006 | 0.5572 | 5100 | 2.5891 |
10.208 | 0.5627 | 5150 | 2.5482 |
9.9471 | 0.5682 | 5200 | 2.5731 |
10.2092 | 0.5736 | 5250 | 2.5134 |
9.8496 | 0.5791 | 5300 | 2.5534 |
10.1939 | 0.5845 | 5350 | 2.4982 |
10.1636 | 0.5900 | 5400 | 2.4370 |
9.962 | 0.5955 | 5450 | 2.4945 |
10.3635 | 0.6009 | 5500 | 2.5168 |
9.754 | 0.6064 | 5550 | 2.5053 |
10.2112 | 0.6119 | 5600 | 2.4416 |
9.9659 | 0.6173 | 5650 | 2.5780 |
9.6756 | 0.6228 | 5700 | 2.4121 |
9.9777 | 0.6282 | 5750 | 2.4450 |
9.9441 | 0.6337 | 5800 | 2.4634 |
10.4017 | 0.6392 | 5850 | 2.5407 |
10.0558 | 0.6446 | 5900 | 2.4228 |
9.7832 | 0.6501 | 5950 | 2.4340 |
9.9771 | 0.6556 | 6000 | 2.4906 |
9.4138 | 0.6610 | 6050 | 2.5171 |
10.2916 | 0.6665 | 6100 | 2.4348 |
9.8759 | 0.6719 | 6150 | 2.3867 |
9.9418 | 0.6774 | 6200 | 2.3981 |
9.6188 | 0.6829 | 6250 | 2.4660 |
9.8974 | 0.6883 | 6300 | 2.4299 |
10.0928 | 0.6938 | 6350 | 2.4024 |
9.9564 | 0.6993 | 6400 | 2.4812 |
9.7911 | 0.7047 | 6450 | 2.3437 |
10.3234 | 0.7102 | 6500 | 2.4240 |
9.8974 | 0.7157 | 6550 | 2.5699 |
9.2776 | 0.7211 | 6600 | 2.4354 |
9.7232 | 0.7266 | 6650 | 2.3804 |
10.05 | 0.7320 | 6700 | 2.4174 |
9.6149 | 0.7375 | 6750 | 2.4039 |
10.0379 | 0.7430 | 6800 | 2.5200 |
10.1982 | 0.7484 | 6850 | 2.4522 |
10.0545 | 0.7539 | 6900 | 2.4185 |
9.5577 | 0.7594 | 6950 | nan |
10.6035 | 0.7648 | 7000 | 2.3955 |
9.7875 | 0.7703 | 7050 | nan |
9.8262 | 0.7757 | 7100 | 2.4640 |
9.4249 | 0.7812 | 7150 | 2.3711 |
9.573 | 0.7867 | 7200 | 2.3369 |
9.5382 | 0.7921 | 7250 | 2.4253 |
9.4487 | 0.7976 | 7300 | 2.3971 |
9.6848 | 0.8031 | 7350 | 2.5155 |
9.1989 | 0.8085 | 7400 | nan |
9.1517 | 0.8140 | 7450 | 2.4483 |
10.0034 | 0.8194 | 7500 | 2.4458 |
9.2463 | 0.8249 | 7550 | 2.4025 |
9.8742 | 0.8304 | 7600 | 2.4496 |
9.8066 | 0.8358 | 7650 | 2.4838 |
9.2467 | 0.8413 | 7700 | 2.3789 |
9.6915 | 0.8468 | 7750 | 2.4223 |
9.9683 | 0.8522 | 7800 | 2.3724 |
9.5033 | 0.8577 | 7850 | 2.2997 |
9.4444 | 0.8632 | 7900 | 2.3901 |
9.5059 | 0.8686 | 7950 | 2.3708 |
9.513 | 0.8741 | 8000 | 2.3695 |
9.3093 | 0.8795 | 8050 | 2.4197 |
9.2414 | 0.8850 | 8100 | 2.4257 |
9.0852 | 0.8905 | 8150 | 2.3838 |
9.7345 | 0.8959 | 8200 | 2.4002 |
9.2903 | 0.9014 | 8250 | 2.3707 |
9.5652 | 0.9069 | 8300 | 2.3025 |
9.2533 | 0.9123 | 8350 | 2.3738 |
9.5378 | 0.9178 | 8400 | 2.4080 |
9.4812 | 0.9232 | 8450 | 2.4775 |
9.6664 | 0.9287 | 8500 | 2.3231 |
9.9709 | 0.9342 | 8550 | 2.3560 |
9.5003 | 0.9396 | 8600 | 2.3892 |
9.183 | 0.9451 | 8650 | 2.3027 |
9.4163 | 0.9506 | 8700 | 2.4888 |
10.2318 | 0.9560 | 8750 | 2.2755 |
9.6414 | 0.9615 | 8800 | 2.2422 |
9.2835 | 0.9669 | 8850 | 2.4216 |
9.5811 | 0.9724 | 8900 | 2.3790 |
9.0775 | 0.9779 | 8950 | 2.2990 |
9.3801 | 0.9833 | 9000 | 2.3856 |
9.5136 | 0.9888 | 9050 | nan |
9.6601 | 0.9943 | 9100 | 2.3805 |
9.7597 | 0.9997 | 9150 | 2.3291 |
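Assuming the reported validation loss is a mean cross-entropy in nats over masked tokens, the final value corresponds to a perplexity of roughly exp(2.3291) ≈ 10.3; a quick check:

```python
# Perplexity from the final validation loss, assuming it is a mean
# cross-entropy in nats over masked tokens.
import math

final_loss = 2.3291
print(math.exp(final_loss))  # ≈ 10.27
```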
### Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
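To reproduce the numbers above it may help to match these versions; a minimal environment check is shown below (the exact pins are presumably what the training run used, though nearby versions likely behave the same):

```python
# Print installed versions to compare against the pins listed above.
import datasets
import tokenizers
import torch
import transformers

for name, mod in [("Transformers", transformers), ("PyTorch", torch),
                  ("Datasets", datasets), ("Tokenizers", tokenizers)]:
    print(f"{name}: {mod.__version__}")
```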