# mms-1b-allFT-Dahnon-ara
This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.7716
- Wer: 0.9348
## Model description
More information needed
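In the absence of a fuller description, the snippet below is a minimal transcription sketch. It assumes the checkpoint follows the standard MMS/Wav2Vec2-CTC layout used by facebook/mms-1b-all fine-tunes and that the target language is Arabic (inferred from the model name); the audio file path is a placeholder.

```python
# Minimal sketch, assuming a standard Wav2Vec2-CTC checkpoint layout.
# "example.wav" is a placeholder; MMS models expect 16 kHz mono audio.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "sqrk/mms-1b-allFT-Dahnon-ara"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("example.wav", sr=16_000, mono=True)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```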
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 100
- mixed_precision_training: Native AMP
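For reference, the sketch below shows how these hyperparameters would typically be expressed as transformers `TrainingArguments`. It is not the original training script; the output directory and any option not listed above are placeholders.

```python
# Sketch of the listed hyperparameters as TrainingArguments;
# output_dir and unlisted options are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-allFT-Dahnon-ara",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,   # 2 per device x 8 accumulation = 16 total
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
    fp16=True,                       # Native AMP mixed precision
)
```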
### Training results
Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
9.9472 | 0.9814 | 33 | 10.1025 | 1.0079 |
9.9673 | 1.9814 | 66 | 10.0618 | 1.0090 |
9.8936 | 2.9814 | 99 | 9.9765 | 1.0157 |
9.7701 | 3.9814 | 132 | 9.8477 | 1.0202 |
9.6077 | 4.9814 | 165 | 9.6755 | 1.0337 |
9.4201 | 5.9814 | 198 | 9.4574 | 1.0472 |
9.1832 | 6.9814 | 231 | 9.1933 | 1.0664 |
8.8801 | 7.9814 | 264 | 8.8763 | 1.0787 |
8.5414 | 8.9814 | 297 | 8.5257 | 1.0877 |
8.1854 | 9.9814 | 330 | 8.1284 | 1.0709 |
7.763 | 10.9814 | 363 | 7.6892 | 1.0529 |
7.2828 | 11.9814 | 396 | 7.1843 | 1.0304 |
6.7643 | 12.9814 | 429 | 6.6300 | 1.0124 |
6.2062 | 13.9814 | 462 | 6.0169 | 1.0022 |
5.5846 | 14.9814 | 495 | 5.3558 | 1.0 |
4.92 | 15.9814 | 528 | 4.6667 | 1.0 |
4.3568 | 16.9814 | 561 | 4.1007 | 1.0 |
3.9133 | 17.9814 | 594 | 3.7354 | 1.0 |
3.6595 | 18.9814 | 627 | 3.5499 | 1.0 |
3.532 | 19.9814 | 660 | 3.4643 | 1.0 |
3.4538 | 20.9814 | 693 | 3.4138 | 1.0 |
3.4068 | 21.9814 | 726 | 3.3794 | 1.0 |
3.3777 | 22.9814 | 759 | 3.3551 | 1.0 |
3.3522 | 23.9814 | 792 | 3.3338 | 1.0 |
3.3334 | 24.9814 | 825 | 3.3149 | 1.0 |
3.313 | 25.9814 | 858 | 3.2934 | 1.0 |
3.2863 | 26.9814 | 891 | 3.2699 | 1.0 |
3.2705 | 27.9814 | 924 | 3.2419 | 1.0 |
3.239 | 28.9814 | 957 | 3.2087 | 1.0 |
3.2112 | 29.9814 | 990 | 3.1722 | 0.9989 |
3.1728 | 30.9814 | 1023 | 3.1235 | 0.9966 |
3.1277 | 31.9814 | 1056 | 3.0639 | 0.9966 |
3.0739 | 32.9814 | 1089 | 2.9944 | 0.9978 |
3.0063 | 33.9814 | 1122 | 2.9176 | 0.9978 |
2.9492 | 34.9814 | 1155 | 2.8354 | 0.9978 |
2.8745 | 35.9814 | 1188 | 2.7467 | 0.9989 |
2.8018 | 36.9814 | 1221 | 2.6584 | 0.9978 |
2.7188 | 37.9814 | 1254 | 2.5763 | 0.9989 |
2.6582 | 38.9814 | 1287 | 2.4999 | 0.9933 |
2.5816 | 39.9814 | 1320 | 2.4275 | 0.9910 |
2.5207 | 40.9814 | 1353 | 2.3672 | 0.9831 |
2.4655 | 41.9814 | 1386 | 2.3115 | 0.9831 |
2.4185 | 42.9814 | 1419 | 2.2616 | 0.9764 |
2.383 | 43.9814 | 1452 | 2.2175 | 0.9764 |
2.3566 | 44.9814 | 1485 | 2.1813 | 0.9719 |
2.3175 | 45.9814 | 1518 | 2.1477 | 0.9685 |
2.2717 | 46.9814 | 1551 | 2.1162 | 0.9663 |
2.2673 | 47.9814 | 1584 | 2.0888 | 0.9629 |
2.2369 | 48.9814 | 1617 | 2.0654 | 0.9606 |
2.2121 | 49.9814 | 1650 | 2.0425 | 0.9606 |
2.197 | 50.9814 | 1683 | 2.0228 | 0.9573 |
2.1756 | 51.9814 | 1716 | 2.0032 | 0.9550 |
2.1686 | 52.9814 | 1749 | 1.9858 | 0.9561 |
2.1429 | 53.9814 | 1782 | 1.9690 | 0.9516 |
2.1337 | 54.9814 | 1815 | 1.9542 | 0.9516 |
2.1211 | 55.9814 | 1848 | 1.9408 | 0.9505 |
2.1063 | 56.9814 | 1881 | 1.9288 | 0.9483 |
2.092 | 57.9814 | 1914 | 1.9173 | 0.9471 |
2.0887 | 58.9814 | 1947 | 1.9069 | 0.9471 |
2.0763 | 59.9814 | 1980 | 1.8977 | 0.9438 |
2.0762 | 60.9814 | 2013 | 1.8890 | 0.9460 |
2.0668 | 61.9814 | 2046 | 1.8809 | 0.9415 |
2.0594 | 62.9814 | 2079 | 1.8732 | 0.9404 |
2.0534 | 63.9814 | 2112 | 1.8659 | 0.9426 |
2.0405 | 64.9814 | 2145 | 1.8594 | 0.9426 |
2.0395 | 65.9814 | 2178 | 1.8530 | 0.9393 |
2.0177 | 66.9814 | 2211 | 1.8471 | 0.9449 |
2.033 | 67.9814 | 2244 | 1.8416 | 0.9404 |
2.0289 | 68.9814 | 2277 | 1.8361 | 0.9393 |
2.0164 | 69.9814 | 2310 | 1.8312 | 0.9404 |
2.0012 | 70.9814 | 2343 | 1.8253 | 0.9393 |
1.9994 | 71.9814 | 2376 | 1.8225 | 0.9393 |
2.0064 | 72.9814 | 2409 | 1.8193 | 0.9404 |
2.0064 | 73.9814 | 2442 | 1.8147 | 0.9404 |
1.9899 | 74.9814 | 2475 | 1.8115 | 0.9404 |
1.9993 | 75.9814 | 2508 | 1.8082 | 0.9381 |
1.9953 | 76.9814 | 2541 | 1.8052 | 0.9393 |
1.99 | 77.9814 | 2574 | 1.8025 | 0.9393 |
1.9757 | 78.9814 | 2607 | 1.7995 | 0.9393 |
1.9755 | 79.9814 | 2640 | 1.7962 | 0.9404 |
1.9813 | 80.9814 | 2673 | 1.7936 | 0.9393 |
1.9851 | 81.9814 | 2706 | 1.7917 | 0.9393 |
1.975 | 82.9814 | 2739 | 1.7895 | 0.9404 |
1.9693 | 83.9814 | 2772 | 1.7875 | 0.9370 |
1.9684 | 84.9814 | 2805 | 1.7854 | 0.9370 |
1.9619 | 85.9814 | 2838 | 1.7836 | 0.9370 |
1.963 | 86.9814 | 2871 | 1.7819 | 0.9359 |
1.9678 | 87.9814 | 2904 | 1.7804 | 0.9359 |
1.9611 | 88.9814 | 2937 | 1.7791 | 0.9359 |
1.9659 | 89.9814 | 2970 | 1.7778 | 0.9359 |
1.9627 | 90.9814 | 3003 | 1.7767 | 0.9348 |
1.9517 | 91.9814 | 3036 | 1.7758 | 0.9348 |
1.9563 | 92.9814 | 3069 | 1.7749 | 0.9348 |
1.9574 | 93.9814 | 3102 | 1.7739 | 0.9348 |
1.9665 | 94.9814 | 3135 | 1.7733 | 0.9348 |
1.9416 | 95.9814 | 3168 | 1.7728 | 0.9336 |
1.9613 | 96.9814 | 3201 | 1.7723 | 0.9336 |
1.955 | 97.9814 | 3234 | 1.7720 | 0.9336 |
1.9579 | 98.9814 | 3267 | 1.7720 | 0.9336 |
1.9555 | 99.9814 | 3300 | 1.7716 | 0.9348 |
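The Wer column above is the standard word error rate. As a minimal sketch, it can be computed with the `evaluate` library as shown below; the example strings are placeholders, not samples from the (unspecified) evaluation set.

```python
# Sketch of WER computation with the evaluate library (requires jiwer).
# The strings are placeholders only.
import evaluate

wer_metric = evaluate.load("wer")
predictions = ["transcribed hypothesis text"]
references = ["reference transcription text"]
print(wer_metric.compute(predictions=predictions, references=references))
```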
### Framework versions
- Transformers 4.51.3
- Pytorch 2.4.1
- Datasets 2.16.1
- Tokenizers 0.21.1