# wav2vec2-base-librispeech-model
This model is a fine-tuned version of facebook/wav2vec2-base-960h on the LIBRI10H - ENG dataset. It achieves the following results on the evaluation set:
- Loss: 0.8515
- Wer: 0.7226
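The Wer figure above is the word error rate: the word-level edit distance between hypothesis and reference, divided by the number of reference words. A minimal sketch of the metric (illustrative only; the scorer that produced the numbers above may normalize text differently):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # delete r
                          curr[j - 1] + 1,     # insert h
                          prev[j - 1] + cost)  # substitute r -> h
        prev = curr
    return prev[-1] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on the mat"))  # 0.0
print(wer("a b c d", "a x c"))  # 0.5: one substitution + one deletion over 4 words
```

A WER above 0.7, as reported here, means roughly seven word errors per ten reference words, so this checkpoint is far from the base model's clean-speech performance.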
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 100.0
- mixed_precision_training: Native AMP
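The hyperparameters above can be collected into a plain Python dict (the field names mirror Hugging Face `TrainingArguments`, but that mapping is an assumption, not taken from the card), together with the linear warmup/decay schedule they imply:

```python
# Training configuration from the list above. Field names follow
# TrainingArguments conventions but are assumptions here.
config = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "linear",
    "warmup_steps": 200,
    "num_train_epochs": 100,
    "fp16": True,  # "Native AMP" mixed-precision training
}

# total_train_batch_size (16) is derived, not set directly:
effective_batch = (config["per_device_train_batch_size"]
                   * config["gradient_accumulation_steps"])  # 8 * 2 = 16

def linear_lr(step: int, base_lr: float = 3e-4,
              warmup: int = 200, total_steps: int = 17200) -> float:
    """Linear schedule: ramp up over `warmup` steps, then decay to zero.
    17200 is the last optimizer step logged for this run (an assumption
    for the true total step count)."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup))
```

The schedule peaks at the base learning rate exactly when warmup ends (step 200) and decays linearly thereafter, which matches the rapid early loss drop in the results below.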
### Training results

Training Loss | Epoch | Step | Validation Loss | Wer |
---|---|---|---|---|
4.7426 | 1.1565 | 200 | 2.8968 | 1.0 |
2.7493 | 2.3130 | 400 | 2.2712 | 0.9987 |
2.0118 | 3.4696 | 600 | 1.6905 | 0.9768 |
1.7815 | 4.6261 | 800 | 1.5406 | 0.9588 |
1.667 | 5.7826 | 1000 | 1.4410 | 0.9385 |
1.5898 | 6.9391 | 1200 | 1.3799 | 0.9282 |
1.5366 | 8.0928 | 1400 | 1.3415 | 0.9165 |
1.4917 | 9.2493 | 1600 | 1.3144 | 0.9205 |
1.455 | 10.4058 | 1800 | 1.2746 | 0.9068 |
1.4266 | 11.5623 | 2000 | 1.2521 | 0.9102 |
1.3925 | 12.7188 | 2200 | 1.2213 | 0.8971 |
1.3754 | 13.8754 | 2400 | 1.2028 | 0.8938 |
1.3452 | 15.0290 | 2600 | 1.1931 | 0.8826 |
1.3265 | 16.1855 | 2800 | 1.1682 | 0.8860 |
1.3106 | 17.3420 | 3000 | 1.1645 | 0.8752 |
1.2917 | 18.4986 | 3200 | 1.1686 | 0.8780 |
1.2745 | 19.6551 | 3400 | 1.1385 | 0.8670 |
1.2639 | 20.8116 | 3600 | 1.1301 | 0.8666 |
1.2432 | 21.9681 | 3800 | 1.1173 | 0.8670 |
1.2294 | 23.1217 | 4000 | 1.1098 | 0.8620 |
1.2203 | 24.2783 | 4200 | 1.1077 | 0.8711 |
1.2037 | 25.4348 | 4400 | 1.0964 | 0.8626 |
1.1965 | 26.5913 | 4600 | 1.0910 | 0.8581 |
1.181 | 27.7478 | 4800 | 1.0842 | 0.8533 |
1.1711 | 28.9043 | 5000 | 1.0692 | 0.8465 |
1.1573 | 30.0580 | 5200 | 1.0724 | 0.8464 |
1.1472 | 31.2145 | 5400 | 1.0529 | 0.8404 |
1.1375 | 32.3710 | 5600 | 1.0506 | 0.8403 |
1.1276 | 33.5275 | 5800 | 1.0432 | 0.8398 |
1.1149 | 34.6841 | 6000 | 1.0371 | 0.8330 |
1.1099 | 35.8406 | 6200 | 1.0372 | 0.8341 |
1.0959 | 36.9971 | 6400 | 1.0296 | 0.8370 |
1.0838 | 38.1507 | 6600 | 1.0136 | 0.8232 |
1.0761 | 39.3072 | 6800 | 1.0355 | 0.8288 |
1.069 | 40.4638 | 7000 | 1.0072 | 0.8211 |
1.0624 | 41.6203 | 7200 | 1.0019 | 0.8217 |
1.0502 | 42.7768 | 7400 | 1.0021 | 0.8329 |
1.0423 | 43.9333 | 7600 | 0.9960 | 0.8153 |
1.0334 | 45.0870 | 7800 | 0.9903 | 0.8134 |
1.0203 | 46.2435 | 8000 | 0.9787 | 0.8116 |
1.0212 | 47.4 | 8200 | 0.9690 | 0.8029 |
1.0062 | 48.5565 | 8400 | 0.9864 | 0.8030 |
1.0029 | 49.7130 | 8600 | 0.9658 | 0.8000 |
0.9922 | 50.8696 | 8800 | 0.9552 | 0.7964 |
0.9784 | 52.0232 | 9000 | 0.9563 | 0.7978 |
0.9761 | 53.1797 | 9200 | 0.9442 | 0.7898 |
0.9649 | 54.3362 | 9400 | 0.9495 | 0.7898 |
0.9567 | 55.4928 | 9600 | 0.9448 | 0.7927 |
0.9556 | 56.6493 | 9800 | 0.9303 | 0.7851 |
0.9454 | 57.8058 | 10000 | 0.9304 | 0.7784 |
0.9356 | 58.9623 | 10200 | 0.9202 | 0.7718 |
0.927 | 60.1159 | 10400 | 0.9264 | 0.7730 |
0.9172 | 61.2725 | 10600 | 0.9252 | 0.7736 |
0.9177 | 62.4290 | 10800 | 0.9087 | 0.7682 |
0.9107 | 63.5855 | 11000 | 0.9119 | 0.7663 |
0.9017 | 64.7420 | 11200 | 0.9014 | 0.7609 |
0.899 | 65.8986 | 11400 | 0.8962 | 0.7597 |
0.8854 | 67.0522 | 11600 | 0.8976 | 0.7533 |
0.8841 | 68.2087 | 11800 | 0.8952 | 0.7554 |
0.8792 | 69.3652 | 12000 | 0.8951 | 0.7535 |
0.8697 | 70.5217 | 12200 | 0.8913 | 0.7513 |
0.8677 | 71.6783 | 12400 | 0.8820 | 0.7496 |
0.862 | 72.8348 | 12600 | 0.8834 | 0.7447 |
0.8573 | 73.9913 | 12800 | 0.8824 | 0.7437 |
0.8527 | 75.1449 | 13000 | 0.8747 | 0.7388 |
0.8451 | 76.3014 | 13200 | 0.8806 | 0.7399 |
0.8435 | 77.4580 | 13400 | 0.8713 | 0.7401 |
0.8393 | 78.6145 | 13600 | 0.8734 | 0.7387 |
0.8353 | 79.7710 | 13800 | 0.8702 | 0.7367 |
0.834 | 80.9275 | 14000 | 0.8661 | 0.7335 |
0.8265 | 82.0812 | 14200 | 0.8642 | 0.7312 |
0.8183 | 83.2377 | 14400 | 0.8638 | 0.7334 |
0.8238 | 84.3942 | 14600 | 0.8643 | 0.7311 |
0.8176 | 85.5507 | 14800 | 0.8640 | 0.7309 |
0.8183 | 86.7072 | 15000 | 0.8603 | 0.7294 |
0.8121 | 87.8638 | 15200 | 0.8586 | 0.7270 |
0.8033 | 89.0174 | 15400 | 0.8585 | 0.7265 |
0.8116 | 90.1739 | 15600 | 0.8560 | 0.7254 |
0.8058 | 91.3304 | 15800 | 0.8553 | 0.7262 |
0.7992 | 92.4870 | 16000 | 0.8548 | 0.7263 |
0.7979 | 93.6435 | 16200 | 0.8528 | 0.7236 |
0.7979 | 94.8 | 16400 | 0.8529 | 0.7235 |
0.7978 | 95.9565 | 16600 | 0.8526 | 0.7242 |
0.7934 | 97.1101 | 16800 | 0.8519 | 0.7238 |
0.7915 | 98.2667 | 17000 | 0.8520 | 0.7233 |
0.7996 | 99.4232 | 17200 | 0.8515 | 0.7230 |
### Framework versions
- Transformers 4.49.0
- PyTorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
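To reproduce this run against the same library versions, they can be pinned, e.g. in a `requirements.txt` (package names are the standard PyPI names assumed to correspond to the libraries above; the `+cu124` suffix denotes the CUDA 12.4 build of PyTorch and is selected via the PyTorch wheel index rather than the version pin):

```
transformers==4.49.0
torch==2.6.0
datasets==3.3.2
tokenizers==0.21.0
```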
## Model tree for csikasote/wav2vec2-base-librispeech-model

Base model: facebook/wav2vec2-base-960h