
libri-alpha-0.85-Temp-1-processor-change

This model is a distilled version of Wav2vec2, trained on 30% of the Librispeech-clean.100 dataset. It achieves the following results on the evaluation set:

  • Loss: 78.4467
  • Wer: 0.1153
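
`Wer` is word error rate on the evaluation split: at 0.1153, roughly one word in nine is transcribed incorrectly. The card does not say which implementation produced the number; a minimal sketch using the Hugging Face `evaluate` library (the library choice and the example strings are assumptions) would be:

```python
# Hypothetical sketch: computing word error rate (WER) with the Hugging Face
# `evaluate` library. The card does not state which implementation was used.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["the distilled model transcribes this utterance"]
references = ["the distilled model transcribed this utterance"]

# WER = (substitutions + deletions + insertions) / reference word count.
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")  # one substitution over six reference words ≈ 0.1667
```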

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Knowledge distillation from the Wav2vec2-base-960h teacher model to a student model with 6 attention layers.
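
The training code itself is not published in the card, so the following is a minimal sketch of the usual setup: a 6-layer student built from the teacher's configuration, and an alpha-weighted loss blending the supervised CTC term with a temperature-scaled KL divergence against the teacher's logits. The function names, the student initialization, and the KL-on-logits formulation are assumptions; the alpha and temperature values come from the hyperparameters below.

```python
# Hypothetical sketch of the distillation setup implied by the card.
# Names and the loss formulation are assumptions; alpha = 0.75 and
# temperature = 1 come from the hyperparameters listed below.
import torch.nn.functional as F
from transformers import Wav2Vec2Config, Wav2Vec2ForCTC

teacher = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
student_config = Wav2Vec2Config.from_pretrained(
    "facebook/wav2vec2-base-960h", num_hidden_layers=6  # 6-layer student
)
student = Wav2Vec2ForCTC(student_config)

def distillation_loss(student_logits, teacher_logits, ctc_loss,
                      alpha=0.75, temperature=1.0):
    """Blend the supervised CTC loss with a KL term on the teacher's logits."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 as in Hinton et al. (2015); a no-op here since T = 1.
    kd = F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature**2
    return alpha * ctc_loss + (1.0 - alpha) * kd
```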

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
  • alpha: 0.75 (despite the repository name, alpha was 0.75, not 0.85)
  • temperature: 1
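
For reference, the list above maps one-to-one onto `transformers.TrainingArguments`. Whether the Hugging Face Trainer was actually used is an assumption, but the values below are exactly those from the card:

```python
# Hypothetical sketch: the hyperparameters above expressed as
# `transformers.TrainingArguments`. Use of the HF Trainer is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="libri-alpha-0.85-Temp-1-processor-change",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 32 * 2 = 64
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=30,
    fp16=True,                      # "Native AMP" mixed-precision training
)
```

Note that `alpha` and `temperature` are not `TrainingArguments` fields; they belong to the custom distillation loss sketched under Training procedure.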

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 493.9213      | 0.75  | 100  | 145.7981        | 0.1515 |
| 410.8468      | 1.49  | 200  | 119.1579        | 0.1498 |
| 368.5187      | 2.24  | 300  | 109.7572        | 0.1505 |
| 329.7762      | 2.99  | 400  | 99.2350         | 0.1439 |
| 323.7352      | 3.73  | 500  | 92.1173         | 0.1356 |
| 305.1129      | 4.48  | 600  | 89.3685         | 0.1314 |
| 294.2529      | 5.22  | 700  | 88.3937         | 0.1287 |
| 284.5355      | 5.97  | 800  | 87.0589         | 0.1292 |
| 284.2181      | 6.72  | 900  | 86.4474         | 0.1298 |
| 273.915       | 7.46  | 1000 | 84.6149         | 0.1265 |
| 267.7668      | 8.21  | 1100 | 84.1840         | 0.1264 |
| 262.1592      | 8.96  | 1200 | 83.8678         | 0.1253 |
| 262.5562      | 9.7   | 1300 | 83.2756         | 0.1207 |
| 262.9982      | 10.45 | 1400 | 81.8095         | 0.1218 |
| 256.2891      | 11.19 | 1500 | 82.1241         | 0.1204 |
| 251.4134      | 11.94 | 1600 | 80.8432         | 0.1207 |
| 250.0854      | 12.69 | 1700 | 81.1467         | 0.1203 |
| 250.0077      | 13.43 | 1800 | 80.9370         | 0.1196 |
| 239.0915      | 14.18 | 1900 | 80.5060         | 0.1201 |
| 240.9192      | 14.93 | 2000 | 80.4557         | 0.1190 |
| 241.1668      | 15.67 | 2100 | 80.6453         | 0.1203 |
| 244.9744      | 16.42 | 2200 | 80.0101         | 0.1192 |
| 232.4748      | 17.16 | 2300 | 79.4798         | 0.1170 |
| 237.3503      | 17.91 | 2400 | 79.5743         | 0.1175 |
| 237.9698      | 18.66 | 2500 | 79.3368         | 0.1178 |
| 235.8808      | 19.4  | 2600 | 79.5519         | 0.1174 |
| 230.8314      | 20.15 | 2700 | 79.0367         | 0.1166 |
| 229.5856      | 20.9  | 2800 | 79.1809         | 0.1172 |
| 233.1034      | 21.64 | 2900 | 78.9896         | 0.1167 |
| 231.6986      | 22.39 | 3000 | 78.7184         | 0.1154 |
| 222.0106      | 23.13 | 3100 | 78.7308         | 0.1160 |
| 225.1484      | 23.88 | 3200 | 78.6649         | 0.1159 |
| 232.4254      | 24.63 | 3300 | 78.5096         | 0.1154 |
| 230.9492      | 25.37 | 3400 | 78.4873         | 0.1153 |
| 228.3062      | 26.12 | 3500 | 78.5155         | 0.1147 |
| 225.5572      | 26.87 | 3600 | 78.5693         | 0.1148 |
| 227.7358      | 27.61 | 3700 | 78.5487         | 0.1149 |
| 221.2486      | 28.36 | 3800 | 78.4307         | 0.1151 |
| 231.5915      | 29.1  | 3900 | 78.4270         | 0.1153 |
| 231.7214      | 29.85 | 4000 | 78.4467         | 0.1153 |

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.12.1
  • Datasets 2.7.1
  • Tokenizers 0.11.0
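
To reproduce the environment, the versions above can be pinned (for example, `pip install transformers==4.25.1 torch==1.12.1 datasets==2.7.1`). A minimal inference sketch follows; it assumes the repository ships the standard Wav2Vec2 processor and CTC head, which the card implies but does not state outright:

```python
# Hypothetical sketch: greedy CTC decoding with this checkpoint. Assumes the
# repo contains a Wav2Vec2Processor and a CTC head.
import numpy as np
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

repo = "rohitp1/libri-alpha-0.85-Temp-1-processor-change"
processor = Wav2Vec2Processor.from_pretrained(repo)
model = Wav2Vec2ForCTC.from_pretrained(repo)

speech = np.zeros(16_000, dtype=np.float32)  # placeholder: 1 s of 16 kHz audio
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits
transcription = processor.batch_decode(torch.argmax(logits, dim=-1))[0]
print(transcription)
```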