mms-1b-allFT-Dahnon-ara

This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7716
  • Wer: 0.9348
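
The card does not include a usage snippet, so the following is a minimal inference sketch only: it assumes the published checkpoint `sqrk/mms-1b-allFT-Dahnon-ara`, a 16 kHz mono audio file (here a placeholder `example.wav`), and `librosa` for audio loading.

```python
# Minimal inference sketch (not from the card): greedy CTC decoding with the
# fine-tuned MMS checkpoint. "example.wav" is a placeholder path.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "sqrk/mms-1b-allFT-Dahnon-ara"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS models expect 16 kHz input.
speech, _ = librosa.load("example.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```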

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 100
  • mixed_precision_training: Native AMP
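
For reference, these settings map onto transformers `TrainingArguments` roughly as sketched below. This is an illustration only, not the actual training script: the output directory is a placeholder, and the multi-GPU/distributed setup comes from the launcher (e.g. `torchrun`), not from these arguments.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mms-1b-allFT-Dahnon-ara",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,   # 2 x 8 (x GPUs) -> total train batch size 16
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
    fp16=True,                       # native AMP mixed precision
)
```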

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|:---:|:---:|:---:|:---:|:---:|
| 9.9472 | 0.9814 | 33 | 10.1025 | 1.0079 |
| 9.9673 | 1.9814 | 66 | 10.0618 | 1.0090 |
| 9.8936 | 2.9814 | 99 | 9.9765 | 1.0157 |
| 9.7701 | 3.9814 | 132 | 9.8477 | 1.0202 |
| 9.6077 | 4.9814 | 165 | 9.6755 | 1.0337 |
| 9.4201 | 5.9814 | 198 | 9.4574 | 1.0472 |
| 9.1832 | 6.9814 | 231 | 9.1933 | 1.0664 |
| 8.8801 | 7.9814 | 264 | 8.8763 | 1.0787 |
| 8.5414 | 8.9814 | 297 | 8.5257 | 1.0877 |
| 8.1854 | 9.9814 | 330 | 8.1284 | 1.0709 |
| 7.763 | 10.9814 | 363 | 7.6892 | 1.0529 |
| 7.2828 | 11.9814 | 396 | 7.1843 | 1.0304 |
| 6.7643 | 12.9814 | 429 | 6.6300 | 1.0124 |
| 6.2062 | 13.9814 | 462 | 6.0169 | 1.0022 |
| 5.5846 | 14.9814 | 495 | 5.3558 | 1.0 |
| 4.92 | 15.9814 | 528 | 4.6667 | 1.0 |
| 4.3568 | 16.9814 | 561 | 4.1007 | 1.0 |
| 3.9133 | 17.9814 | 594 | 3.7354 | 1.0 |
| 3.6595 | 18.9814 | 627 | 3.5499 | 1.0 |
| 3.532 | 19.9814 | 660 | 3.4643 | 1.0 |
| 3.4538 | 20.9814 | 693 | 3.4138 | 1.0 |
| 3.4068 | 21.9814 | 726 | 3.3794 | 1.0 |
| 3.3777 | 22.9814 | 759 | 3.3551 | 1.0 |
| 3.3522 | 23.9814 | 792 | 3.3338 | 1.0 |
| 3.3334 | 24.9814 | 825 | 3.3149 | 1.0 |
| 3.313 | 25.9814 | 858 | 3.2934 | 1.0 |
| 3.2863 | 26.9814 | 891 | 3.2699 | 1.0 |
| 3.2705 | 27.9814 | 924 | 3.2419 | 1.0 |
| 3.239 | 28.9814 | 957 | 3.2087 | 1.0 |
| 3.2112 | 29.9814 | 990 | 3.1722 | 0.9989 |
| 3.1728 | 30.9814 | 1023 | 3.1235 | 0.9966 |
| 3.1277 | 31.9814 | 1056 | 3.0639 | 0.9966 |
| 3.0739 | 32.9814 | 1089 | 2.9944 | 0.9978 |
| 3.0063 | 33.9814 | 1122 | 2.9176 | 0.9978 |
| 2.9492 | 34.9814 | 1155 | 2.8354 | 0.9978 |
| 2.8745 | 35.9814 | 1188 | 2.7467 | 0.9989 |
| 2.8018 | 36.9814 | 1221 | 2.6584 | 0.9978 |
| 2.7188 | 37.9814 | 1254 | 2.5763 | 0.9989 |
| 2.6582 | 38.9814 | 1287 | 2.4999 | 0.9933 |
| 2.5816 | 39.9814 | 1320 | 2.4275 | 0.9910 |
| 2.5207 | 40.9814 | 1353 | 2.3672 | 0.9831 |
| 2.4655 | 41.9814 | 1386 | 2.3115 | 0.9831 |
| 2.4185 | 42.9814 | 1419 | 2.2616 | 0.9764 |
| 2.383 | 43.9814 | 1452 | 2.2175 | 0.9764 |
| 2.3566 | 44.9814 | 1485 | 2.1813 | 0.9719 |
| 2.3175 | 45.9814 | 1518 | 2.1477 | 0.9685 |
| 2.2717 | 46.9814 | 1551 | 2.1162 | 0.9663 |
| 2.2673 | 47.9814 | 1584 | 2.0888 | 0.9629 |
| 2.2369 | 48.9814 | 1617 | 2.0654 | 0.9606 |
| 2.2121 | 49.9814 | 1650 | 2.0425 | 0.9606 |
| 2.197 | 50.9814 | 1683 | 2.0228 | 0.9573 |
| 2.1756 | 51.9814 | 1716 | 2.0032 | 0.9550 |
| 2.1686 | 52.9814 | 1749 | 1.9858 | 0.9561 |
| 2.1429 | 53.9814 | 1782 | 1.9690 | 0.9516 |
| 2.1337 | 54.9814 | 1815 | 1.9542 | 0.9516 |
| 2.1211 | 55.9814 | 1848 | 1.9408 | 0.9505 |
| 2.1063 | 56.9814 | 1881 | 1.9288 | 0.9483 |
| 2.092 | 57.9814 | 1914 | 1.9173 | 0.9471 |
| 2.0887 | 58.9814 | 1947 | 1.9069 | 0.9471 |
| 2.0763 | 59.9814 | 1980 | 1.8977 | 0.9438 |
| 2.0762 | 60.9814 | 2013 | 1.8890 | 0.9460 |
| 2.0668 | 61.9814 | 2046 | 1.8809 | 0.9415 |
| 2.0594 | 62.9814 | 2079 | 1.8732 | 0.9404 |
| 2.0534 | 63.9814 | 2112 | 1.8659 | 0.9426 |
| 2.0405 | 64.9814 | 2145 | 1.8594 | 0.9426 |
| 2.0395 | 65.9814 | 2178 | 1.8530 | 0.9393 |
| 2.0177 | 66.9814 | 2211 | 1.8471 | 0.9449 |
| 2.033 | 67.9814 | 2244 | 1.8416 | 0.9404 |
| 2.0289 | 68.9814 | 2277 | 1.8361 | 0.9393 |
| 2.0164 | 69.9814 | 2310 | 1.8312 | 0.9404 |
| 2.0012 | 70.9814 | 2343 | 1.8253 | 0.9393 |
| 1.9994 | 71.9814 | 2376 | 1.8225 | 0.9393 |
| 2.0064 | 72.9814 | 2409 | 1.8193 | 0.9404 |
| 2.0064 | 73.9814 | 2442 | 1.8147 | 0.9404 |
| 1.9899 | 74.9814 | 2475 | 1.8115 | 0.9404 |
| 1.9993 | 75.9814 | 2508 | 1.8082 | 0.9381 |
| 1.9953 | 76.9814 | 2541 | 1.8052 | 0.9393 |
| 1.99 | 77.9814 | 2574 | 1.8025 | 0.9393 |
| 1.9757 | 78.9814 | 2607 | 1.7995 | 0.9393 |
| 1.9755 | 79.9814 | 2640 | 1.7962 | 0.9404 |
| 1.9813 | 80.9814 | 2673 | 1.7936 | 0.9393 |
| 1.9851 | 81.9814 | 2706 | 1.7917 | 0.9393 |
| 1.975 | 82.9814 | 2739 | 1.7895 | 0.9404 |
| 1.9693 | 83.9814 | 2772 | 1.7875 | 0.9370 |
| 1.9684 | 84.9814 | 2805 | 1.7854 | 0.9370 |
| 1.9619 | 85.9814 | 2838 | 1.7836 | 0.9370 |
| 1.963 | 86.9814 | 2871 | 1.7819 | 0.9359 |
| 1.9678 | 87.9814 | 2904 | 1.7804 | 0.9359 |
| 1.9611 | 88.9814 | 2937 | 1.7791 | 0.9359 |
| 1.9659 | 89.9814 | 2970 | 1.7778 | 0.9359 |
| 1.9627 | 90.9814 | 3003 | 1.7767 | 0.9348 |
| 1.9517 | 91.9814 | 3036 | 1.7758 | 0.9348 |
| 1.9563 | 92.9814 | 3069 | 1.7749 | 0.9348 |
| 1.9574 | 93.9814 | 3102 | 1.7739 | 0.9348 |
| 1.9665 | 94.9814 | 3135 | 1.7733 | 0.9348 |
| 1.9416 | 95.9814 | 3168 | 1.7728 | 0.9336 |
| 1.9613 | 96.9814 | 3201 | 1.7723 | 0.9336 |
| 1.955 | 97.9814 | 3234 | 1.7720 | 0.9336 |
| 1.9579 | 98.9814 | 3267 | 1.7720 | 0.9336 |
| 1.9555 | 99.9814 | 3300 | 1.7716 | 0.9348 |
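
The Wer column above is the standard word error rate. The card does not include the evaluation script, but a metric like this can be reproduced with the `evaluate` library, as in the sketch below (the prediction and reference strings are placeholders):

```python
import evaluate

# Word error rate: (substitutions + insertions + deletions) / reference words.
wer_metric = evaluate.load("wer")

predictions = ["example transcription"]   # model outputs (placeholders)
references = ["reference transcription"]  # ground-truth texts (placeholders)

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```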

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.4.1
  • Datasets 2.16.1
  • Tokenizers 0.21.1
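
A quick way to check that a local environment matches the versions listed above (an optional sanity check, not part of the original card):

```python
# Compare installed library versions against those listed on the card.
import transformers, torch, datasets, tokenizers

expected = {
    "transformers": "4.51.3",
    "torch": "2.4.1",
    "datasets": "2.16.1",
    "tokenizers": "0.21.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    print(f"{name}: installed {installed[name]}, card lists {version}")
```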