mms-1b-allFTwPTtok-Dahnon-ara
This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.6487
- Wer: 0.9247
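The card does not say which tool computed the Wer figure. As a minimal sketch of the standard definition (word-level edit distance divided by the number of reference words), the hypothetical helper `word_error_rate` below scores a single utterance. Evaluation libraries usually aggregate errors over the whole evaluation set, so this illustrates the metric rather than reproducing the reported number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate for one utterance (hypothetical helper, not from the card).

    WER = (substitutions + deletions + insertions) / number of reference words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

A Wer of 0.9247 therefore means that the number of word-level errors is about 92% of the number of reference words.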
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 100
- mixed_precision_training: Native AMP
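Two of the values above follow from the others. The sketch below recomputes the total train batch size and traces the learning rate under a linear schedule with warmup. It makes two assumptions: training ran on a single device, and the schedule spans 3300 optimizer steps (the last step listed in the training results). The function name `linear_schedule_lr` is hypothetical.

```python
# Values taken from the hyperparameter list.
train_batch_size = 2
gradient_accumulation_steps = 8
learning_rate = 1e-05
warmup_steps = 500

# Assumption: total number of optimizer steps, read off the final row of the
# training results table; the card does not state it directly.
total_steps = 3300

# Effective batch size per optimizer step (assumes a single device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    for step in (0, 250, 500, 1900, 3300):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at step 500 and reaches zero at the final step.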
Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
4.5662 | 0.9481 | 16 | 4.6001 | 0.9685 |
4.5309 | 1.9481 | 32 | 4.5929 | 0.9685 |
4.5329 | 2.9481 | 48 | 4.6326 | 0.9685 |
4.5317 | 3.9481 | 64 | 4.5605 | 0.9685 |
4.5405 | 4.9481 | 80 | 4.5627 | 0.9697 |
4.4934 | 5.9481 | 96 | 4.5501 | 0.9697 |
4.5128 | 6.9481 | 112 | 4.5409 | 0.9674 |
4.4423 | 7.9481 | 128 | 4.4716 | 0.9663 |
4.4079 | 8.9481 | 144 | 4.4084 | 0.9663 |
4.3634 | 9.9481 | 160 | 4.3930 | 0.9652 |
4.339 | 10.9481 | 176 | 4.3668 | 0.9663 |
4.3278 | 11.9481 | 192 | 4.2998 | 0.9663 |
4.2442 | 12.9481 | 208 | 4.2491 | 0.9663 |
4.2058 | 13.9481 | 224 | 4.1759 | 0.9640 |
4.1347 | 14.9481 | 240 | 4.1531 | 0.9607 |
4.0858 | 15.9481 | 256 | 4.0988 | 0.9618 |
4.057 | 16.9481 | 272 | 4.0238 | 0.9618 |
3.9754 | 17.9481 | 288 | 3.9874 | 0.9618 |
3.9022 | 18.9481 | 304 | 3.8289 | 0.9596 |
3.8378 | 19.9481 | 320 | 3.8411 | 0.9607 |
3.7652 | 20.9481 | 336 | 3.7493 | 0.9596 |
3.6806 | 21.9481 | 352 | 3.6354 | 0.9607 |
3.6253 | 22.9481 | 368 | 3.5473 | 0.9584 |
3.5504 | 23.9481 | 384 | 3.4948 | 0.9596 |
3.4614 | 24.9481 | 400 | 3.4174 | 0.9596 |
3.3813 | 25.9481 | 416 | 3.3505 | 0.9596 |
3.2972 | 26.9481 | 432 | 3.2648 | 0.9596 |
3.2295 | 27.9481 | 448 | 3.1194 | 0.9607 |
3.1405 | 28.9481 | 464 | 3.0986 | 0.9607 |
3.0944 | 29.9481 | 480 | 2.9525 | 0.9584 |
2.9966 | 30.9481 | 496 | 2.9033 | 0.9562 |
2.9363 | 31.9481 | 512 | 2.8352 | 0.9573 |
2.8846 | 32.9481 | 528 | 2.7396 | 0.9528 |
2.8162 | 33.9481 | 544 | 2.6792 | 0.9551 |
2.7796 | 34.9481 | 560 | 2.6296 | 0.9517 |
2.7357 | 35.9481 | 576 | 2.5748 | 0.9528 |
2.7042 | 36.9481 | 592 | 2.5272 | 0.9551 |
2.6645 | 37.9481 | 608 | 2.4788 | 0.9562 |
2.6233 | 38.9481 | 624 | 2.4618 | 0.9551 |
2.5881 | 39.9481 | 640 | 2.4325 | 0.9528 |
2.5649 | 40.9481 | 656 | 2.3929 | 0.9517 |
2.5445 | 41.9481 | 672 | 2.3417 | 0.9506 |
2.505 | 42.9481 | 688 | 2.3447 | 0.9506 |
2.492 | 43.9481 | 704 | 2.3333 | 0.9494 |
2.4481 | 44.9481 | 720 | 2.2963 | 0.9494 |
2.4464 | 45.9481 | 736 | 2.2922 | 0.9449 |
2.427 | 46.9481 | 752 | 2.2873 | 0.9472 |
2.3932 | 47.9481 | 768 | 2.2303 | 0.9472 |
2.3866 | 48.9481 | 784 | 2.2244 | 0.9472 |
2.3761 | 49.9481 | 800 | 2.2135 | 0.9494 |
2.3527 | 50.9481 | 816 | 2.1928 | 0.9483 |
2.3464 | 51.9481 | 832 | 2.1885 | 0.9506 |
2.3422 | 52.9481 | 848 | 2.1207 | 0.9517 |
2.3173 | 53.9481 | 864 | 2.1548 | 0.9528 |
2.3185 | 54.9481 | 880 | 2.1174 | 0.9528 |
2.301 | 55.9481 | 896 | 2.1264 | 0.9528 |
2.2752 | 56.9481 | 912 | 2.1109 | 0.9506 |
2.2851 | 57.9481 | 928 | 2.1136 | 0.9506 |
2.2845 | 58.9481 | 944 | 2.0836 | 0.9506 |
2.2667 | 59.9481 | 960 | 2.0947 | 0.9472 |
2.2432 | 60.9481 | 976 | 2.0727 | 0.9483 |
2.2628 | 61.9481 | 992 | 2.0672 | 0.9461 |
2.2456 | 62.9481 | 1008 | 2.0694 | 0.9449 |
2.232 | 63.9481 | 1024 | 2.0500 | 0.9427 |
2.232 | 64.9481 | 1040 | 2.0411 | 0.9438 |
2.2064 | 65.9481 | 1056 | 2.0616 | 0.9438 |
2.2101 | 66.9481 | 1072 | 2.0398 | 0.9416 |
2.2129 | 67.9481 | 1088 | 2.0300 | 0.9416 |
2.1904 | 68.9481 | 1104 | 2.0082 | 0.9427 |
2.1814 | 69.9481 | 1120 | 2.0284 | 0.9427 |
2.1841 | 70.9481 | 1136 | 2.0015 | 0.9404 |
2.1687 | 71.9481 | 1152 | 1.9901 | 0.9404 |
2.1813 | 72.9481 | 1168 | 2.0006 | 0.9404 |
2.1735 | 73.9481 | 1184 | 2.0057 | 0.9393 |
2.1572 | 74.9481 | 1200 | 1.9838 | 0.9393 |
2.1736 | 75.9481 | 1216 | 1.9880 | 0.9404 |
2.1574 | 76.9481 | 1232 | 1.9782 | 0.9416 |
2.1579 | 77.9481 | 1248 | 1.9740 | 0.9393 |
2.1474 | 78.9481 | 1264 | 1.9566 | 0.9393 |
2.1512 | 79.9481 | 1280 | 1.9681 | 0.9382 |
2.1495 | 80.9481 | 1296 | 1.9760 | 0.9382 |
2.1479 | 81.9481 | 1312 | 1.9753 | 0.9416 |
2.1407 | 82.9481 | 1328 | 1.9586 | 0.9393 |
2.141 | 83.9481 | 1344 | 1.9639 | 0.9360 |
2.125 | 84.9481 | 1360 | 1.9805 | 0.9371 |
2.1837 | 41.9814 | 1386 | 1.9705 | 0.9371 |
2.1218 | 42.9814 | 1419 | 1.9699 | 0.9371 |
2.1332 | 43.9814 | 1452 | 1.9677 | 0.9382 |
2.1175 | 44.9814 | 1485 | 1.9647 | 0.9393 |
2.1011 | 45.9814 | 1518 | 1.9609 | 0.9371 |
2.1104 | 46.9814 | 1551 | 1.9557 | 0.9382 |
2.1024 | 47.9814 | 1584 | 1.9496 | 0.9371 |
2.1062 | 48.9814 | 1617 | 1.9430 | 0.9371 |
2.0787 | 49.9814 | 1650 | 1.9361 | 0.9382 |
2.0998 | 50.9814 | 1683 | 1.9275 | 0.9382 |
2.0633 | 51.9814 | 1716 | 1.9188 | 0.9382 |
2.0607 | 52.9814 | 1749 | 1.9094 | 0.9382 |
2.0636 | 53.9814 | 1782 | 1.8984 | 0.9404 |
2.0398 | 54.9814 | 1815 | 1.8879 | 0.9404 |
2.0337 | 55.9814 | 1848 | 1.8776 | 0.9382 |
2.0319 | 56.9814 | 1881 | 1.8638 | 0.9404 |
2.0178 | 57.9814 | 1914 | 1.8525 | 0.9416 |
1.9918 | 58.9814 | 1947 | 1.8432 | 0.9382 |
1.9907 | 59.9814 | 1980 | 1.8330 | 0.9382 |
1.9902 | 60.9814 | 2013 | 1.8225 | 0.9393 |
1.9714 | 61.9814 | 2046 | 1.8133 | 0.9360 |
1.9571 | 62.9814 | 2079 | 1.8058 | 0.9382 |
1.9776 | 63.9814 | 2112 | 1.7982 | 0.9360 |
1.9541 | 64.9814 | 2145 | 1.7901 | 0.9348 |
1.9389 | 65.9814 | 2178 | 1.7824 | 0.9337 |
1.9376 | 66.9814 | 2211 | 1.7763 | 0.9348 |
1.9369 | 67.9814 | 2244 | 1.7684 | 0.9337 |
1.9197 | 68.9814 | 2277 | 1.7612 | 0.9348 |
1.9232 | 69.9814 | 2310 | 1.7532 | 0.9326 |
1.921 | 70.9814 | 2343 | 1.7483 | 0.9326 |
1.9156 | 71.9814 | 2376 | 1.7415 | 0.9292 |
1.9088 | 72.9814 | 2409 | 1.7363 | 0.9292 |
1.9004 | 73.9814 | 2442 | 1.7331 | 0.9236 |
1.884 | 74.9814 | 2475 | 1.7275 | 0.9258 |
1.8918 | 75.9814 | 2508 | 1.7218 | 0.9236 |
1.8822 | 76.9814 | 2541 | 1.7174 | 0.9225 |
1.88 | 77.9814 | 2574 | 1.7138 | 0.9202 |
1.8883 | 78.9814 | 2607 | 1.7089 | 0.9202 |
1.8822 | 79.9814 | 2640 | 1.7038 | 0.9213 |
1.8664 | 80.9814 | 2673 | 1.7015 | 0.9225 |
1.8698 | 81.9814 | 2706 | 1.6981 | 0.9225 |
1.8753 | 82.9814 | 2739 | 1.6946 | 0.9225 |
1.8693 | 83.9814 | 2772 | 1.6913 | 0.9225 |
1.8477 | 84.9814 | 2805 | 1.6882 | 0.9236 |
1.8528 | 85.9814 | 2838 | 1.6859 | 0.9236 |
1.8454 | 86.9814 | 2871 | 1.6827 | 0.9236 |
1.8329 | 87.9814 | 2904 | 1.6795 | 0.9258 |
1.8447 | 88.9814 | 2937 | 1.6763 | 0.9247 |
1.849 | 89.9814 | 2970 | 1.6730 | 0.9247 |
1.8313 | 90.9814 | 3003 | 1.6684 | 0.9225 |
1.8315 | 91.9814 | 3036 | 1.6667 | 0.9213 |
1.8253 | 92.9814 | 3069 | 1.6641 | 0.9225 |
1.8359 | 93.9814 | 3102 | 1.6623 | 0.9247 |
1.8202 | 94.9814 | 3135 | 1.6594 | 0.9236 |
1.8247 | 95.9814 | 3168 | 1.6567 | 0.9247 |
1.817 | 96.9814 | 3201 | 1.6541 | 0.9236 |
1.8102 | 97.9814 | 3234 | 1.6516 | 0.9247 |
1.8191 | 98.9814 | 3267 | 1.6502 | 0.9247 |
1.7903 | 99.9814 | 3300 | 1.6487 | 0.9247 |
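To put the table in perspective, the sketch below compares its first and last rows. The numbers are copied from the table; the helper `relative_reduction` is a hypothetical name introduced for illustration.

```python
# First and last evaluation rows, copied from the training results table.
first_row = {"step": 16, "validation_loss": 4.6001, "wer": 0.9685}
last_row = {"step": 3300, "validation_loss": 1.6487, "wer": 0.9247}


def relative_reduction(start: float, end: float) -> float:
    """Fraction by which a metric dropped between two measurements."""
    return (start - end) / start


wer_absolute_drop = first_row["wer"] - last_row["wer"]
wer_relative_drop = relative_reduction(first_row["wer"], last_row["wer"])
loss_relative_drop = relative_reduction(
    first_row["validation_loss"], last_row["validation_loss"]
)

if __name__ == "__main__":
    print(f"absolute Wer drop:     {wer_absolute_drop:.4f}")
    print(f"relative Wer drop:     {wer_relative_drop:.2%}")
    print(f"relative val-loss drop: {loss_relative_drop:.2%}")
```

Validation loss falls by roughly two thirds over the run, while Wer improves by only about 4.5% relative, so the loss improvement translates into few additional correctly transcribed words.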
Framework versions
- Transformers 4.50.3
- Pytorch 2.2.2+cu121
- Datasets 3.0.0
- Tokenizers 0.21.1
Model tree for sqrk/mms-1b-allFTwPTtok-Dahnon-ara
Base model
facebook/mms-1b-all