mms-1b-allFTwPTtok-Dahnon-ara

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6487
  • Wer: 0.9247
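
Since the base model facebook/mms-1b-all is a Wav2Vec2-style CTC model, the fine-tuned checkpoint can be loaded with the standard transformers ASR classes. A minimal usage sketch, assuming the checkpoint is available as sqrk/mms-1b-allFTwPTtok-Dahnon-ara and the input audio is 16 kHz mono:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

# Assumed repository id; adjust if the checkpoint lives elsewhere.
model_id = "sqrk/mms-1b-allFTwPTtok-Dahnon-ara"

processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS models expect 16 kHz mono audio.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```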

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 100
  • mixed_precision_training: Native AMP
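
The training script itself is not included in this card; the following is a minimal sketch of the transformers TrainingArguments implied by the list above (the output_dir is a placeholder):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mms-1b-allFTwPTtok-Dahnon-ara",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,   # effective train batch size: 2 * 8 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
    fp16=True,                       # Native AMP mixed-precision training
)
```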

Training results

Training Loss Epoch Step Validation Loss Wer
4.5662 0.9481 16 4.6001 0.9685
4.5309 1.9481 32 4.5929 0.9685
4.5329 2.9481 48 4.6326 0.9685
4.5317 3.9481 64 4.5605 0.9685
4.5405 4.9481 80 4.5627 0.9697
4.4934 5.9481 96 4.5501 0.9697
4.5128 6.9481 112 4.5409 0.9674
4.4423 7.9481 128 4.4716 0.9663
4.4079 8.9481 144 4.4084 0.9663
4.3634 9.9481 160 4.3930 0.9652
4.339 10.9481 176 4.3668 0.9663
4.3278 11.9481 192 4.2998 0.9663
4.2442 12.9481 208 4.2491 0.9663
4.2058 13.9481 224 4.1759 0.9640
4.1347 14.9481 240 4.1531 0.9607
4.0858 15.9481 256 4.0988 0.9618
4.057 16.9481 272 4.0238 0.9618
3.9754 17.9481 288 3.9874 0.9618
3.9022 18.9481 304 3.8289 0.9596
3.8378 19.9481 320 3.8411 0.9607
3.7652 20.9481 336 3.7493 0.9596
3.6806 21.9481 352 3.6354 0.9607
3.6253 22.9481 368 3.5473 0.9584
3.5504 23.9481 384 3.4948 0.9596
3.4614 24.9481 400 3.4174 0.9596
3.3813 25.9481 416 3.3505 0.9596
3.2972 26.9481 432 3.2648 0.9596
3.2295 27.9481 448 3.1194 0.9607
3.1405 28.9481 464 3.0986 0.9607
3.0944 29.9481 480 2.9525 0.9584
2.9966 30.9481 496 2.9033 0.9562
2.9363 31.9481 512 2.8352 0.9573
2.8846 32.9481 528 2.7396 0.9528
2.8162 33.9481 544 2.6792 0.9551
2.7796 34.9481 560 2.6296 0.9517
2.7357 35.9481 576 2.5748 0.9528
2.7042 36.9481 592 2.5272 0.9551
2.6645 37.9481 608 2.4788 0.9562
2.6233 38.9481 624 2.4618 0.9551
2.5881 39.9481 640 2.4325 0.9528
2.5649 40.9481 656 2.3929 0.9517
2.5445 41.9481 672 2.3417 0.9506
2.505 42.9481 688 2.3447 0.9506
2.492 43.9481 704 2.3333 0.9494
2.4481 44.9481 720 2.2963 0.9494
2.4464 45.9481 736 2.2922 0.9449
2.427 46.9481 752 2.2873 0.9472
2.3932 47.9481 768 2.2303 0.9472
2.3866 48.9481 784 2.2244 0.9472
2.3761 49.9481 800 2.2135 0.9494
2.3527 50.9481 816 2.1928 0.9483
2.3464 51.9481 832 2.1885 0.9506
2.3422 52.9481 848 2.1207 0.9517
2.3173 53.9481 864 2.1548 0.9528
2.3185 54.9481 880 2.1174 0.9528
2.301 55.9481 896 2.1264 0.9528
2.2752 56.9481 912 2.1109 0.9506
2.2851 57.9481 928 2.1136 0.9506
2.2845 58.9481 944 2.0836 0.9506
2.2667 59.9481 960 2.0947 0.9472
2.2432 60.9481 976 2.0727 0.9483
2.2628 61.9481 992 2.0672 0.9461
2.2456 62.9481 1008 2.0694 0.9449
2.232 63.9481 1024 2.0500 0.9427
2.232 64.9481 1040 2.0411 0.9438
2.2064 65.9481 1056 2.0616 0.9438
2.2101 66.9481 1072 2.0398 0.9416
2.2129 67.9481 1088 2.0300 0.9416
2.1904 68.9481 1104 2.0082 0.9427
2.1814 69.9481 1120 2.0284 0.9427
2.1841 70.9481 1136 2.0015 0.9404
2.1687 71.9481 1152 1.9901 0.9404
2.1813 72.9481 1168 2.0006 0.9404
2.1735 73.9481 1184 2.0057 0.9393
2.1572 74.9481 1200 1.9838 0.9393
2.1736 75.9481 1216 1.9880 0.9404
2.1574 76.9481 1232 1.9782 0.9416
2.1579 77.9481 1248 1.9740 0.9393
2.1474 78.9481 1264 1.9566 0.9393
2.1512 79.9481 1280 1.9681 0.9382
2.1495 80.9481 1296 1.9760 0.9382
2.1479 81.9481 1312 1.9753 0.9416
2.1407 82.9481 1328 1.9586 0.9393
2.141 83.9481 1344 1.9639 0.9360
2.125 84.9481 1360 1.9805 0.9371
2.1837 41.9814 1386 1.9705 0.9371
2.1218 42.9814 1419 1.9699 0.9371
2.1332 43.9814 1452 1.9677 0.9382
2.1175 44.9814 1485 1.9647 0.9393
2.1011 45.9814 1518 1.9609 0.9371
2.1104 46.9814 1551 1.9557 0.9382
2.1024 47.9814 1584 1.9496 0.9371
2.1062 48.9814 1617 1.9430 0.9371
2.0787 49.9814 1650 1.9361 0.9382
2.0998 50.9814 1683 1.9275 0.9382
2.0633 51.9814 1716 1.9188 0.9382
2.0607 52.9814 1749 1.9094 0.9382
2.0636 53.9814 1782 1.8984 0.9404
2.0398 54.9814 1815 1.8879 0.9404
2.0337 55.9814 1848 1.8776 0.9382
2.0319 56.9814 1881 1.8638 0.9404
2.0178 57.9814 1914 1.8525 0.9416
1.9918 58.9814 1947 1.8432 0.9382
1.9907 59.9814 1980 1.8330 0.9382
1.9902 60.9814 2013 1.8225 0.9393
1.9714 61.9814 2046 1.8133 0.9360
1.9571 62.9814 2079 1.8058 0.9382
1.9776 63.9814 2112 1.7982 0.9360
1.9541 64.9814 2145 1.7901 0.9348
1.9389 65.9814 2178 1.7824 0.9337
1.9376 66.9814 2211 1.7763 0.9348
1.9369 67.9814 2244 1.7684 0.9337
1.9197 68.9814 2277 1.7612 0.9348
1.9232 69.9814 2310 1.7532 0.9326
1.921 70.9814 2343 1.7483 0.9326
1.9156 71.9814 2376 1.7415 0.9292
1.9088 72.9814 2409 1.7363 0.9292
1.9004 73.9814 2442 1.7331 0.9236
1.884 74.9814 2475 1.7275 0.9258
1.8918 75.9814 2508 1.7218 0.9236
1.8822 76.9814 2541 1.7174 0.9225
1.88 77.9814 2574 1.7138 0.9202
1.8883 78.9814 2607 1.7089 0.9202
1.8822 79.9814 2640 1.7038 0.9213
1.8664 80.9814 2673 1.7015 0.9225
1.8698 81.9814 2706 1.6981 0.9225
1.8753 82.9814 2739 1.6946 0.9225
1.8693 83.9814 2772 1.6913 0.9225
1.8477 84.9814 2805 1.6882 0.9236
1.8528 85.9814 2838 1.6859 0.9236
1.8454 86.9814 2871 1.6827 0.9236
1.8329 87.9814 2904 1.6795 0.9258
1.8447 88.9814 2937 1.6763 0.9247
1.849 89.9814 2970 1.6730 0.9247
1.8313 90.9814 3003 1.6684 0.9225
1.8315 91.9814 3036 1.6667 0.9213
1.8253 92.9814 3069 1.6641 0.9225
1.8359 93.9814 3102 1.6623 0.9247
1.8202 94.9814 3135 1.6594 0.9236
1.8247 95.9814 3168 1.6567 0.9247
1.817 96.9814 3201 1.6541 0.9236
1.8102 97.9814 3234 1.6516 0.9247
1.8191 98.9814 3267 1.6502 0.9247
1.7903 99.9814 3300 1.6487 0.9247
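
The Wer column is word error rate (lower is better; values near 1.0 mean most words are transcribed incorrectly). A minimal sketch of how WER can be computed for such an evaluation, assuming the evaluate library with its jiwer backend (the strings below are placeholders, not data from this run):

```python
import evaluate  # pip install evaluate jiwer

wer_metric = evaluate.load("wer")

# Placeholders; during evaluation these come from CTC-decoded predictions
# and the reference transcripts of the evaluation set.
predictions = ["this is a sample transcription"]
references = ["this is the sample transcription"]

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")  # fraction of word-level errors
```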

Framework versions

  • Transformers 4.50.3
  • Pytorch 2.2.2+cu121
  • Datasets 3.0.0
  • Tokenizers 0.21.1
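
A quick way to check that a local environment matches these versions (package names assumed to be the standard PyPI distributions):

```python
import transformers, torch, datasets, tokenizers

# Expected: 4.50.3, 2.2.2+cu121, 3.0.0, 0.21.1
print(transformers.__version__, torch.__version__,
      datasets.__version__, tokenizers.__version__)
```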