# whisper_large_v3_turbo_finetuned_en_id_v1
This model is a fine-tuned version of openai/whisper-large-v3-turbo on a dataset that is not named in this card. It achieves the following results on the evaluation set:
- eval_loss: 0.1362
- eval_runtime: 263.9538
- eval_samples_per_second: 5.906
- eval_steps_per_second: 5.906
- epoch: 3 (selected checkpoint)
- step: 687 (selected checkpoint)
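The card does not ship with usage code, so the following is a minimal inference sketch using 🤗 Transformers. The repo id and the audio path are placeholders; replace them with the actual Hub path of this checkpoint and your own audio file.

```python
import torch
from transformers import pipeline

# Placeholder repo id: substitute the actual Hugging Face Hub path of this checkpoint.
model_id = "whisper_large_v3_turbo_finetuned_en_id_v1"

asr = pipeline(
    "automatic-speech-recognition",
    model=model_id,
    torch_dtype=torch.float16,
    device="cuda:0" if torch.cuda.is_available() else "cpu",
)

# Transcribe a local audio file (Whisper expects 16 kHz mono input).
result = asr("sample.wav")
print(result["text"])
```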
## Training and evaluation results
| Step | Epoch | Training Loss | Eval Loss | Grad Norm | Learning Rate | Eval Samples/s | Eval Steps/s | Eval Runtime (s) |
|---|---|---|---|---|---|---|---|---|
| 229 | 1.00 | 0.3841 | 0.1563 | 2.216 | 9.77e-06 | 5.91 | 5.91 | 263.95 |
| 458 | 2.00 | 0.1163 | 0.1449 | 1.855 | 9.36e-06 | 5.91 | 5.91 | 263.95 |
| 687 | 3.00 | 0.0693 | 0.1363 | 1.538 | 8.96e-06 | 5.91 | 5.91 | 263.95 |
| 917 | 4.00 | 0.0408 | 0.1376 | 1.367 | 8.55e-06 | 5.89 | 5.89 | 264.59 |
| 1146 | 5.00 | 0.0229 | 0.1398 | 0.982 | 8.14e-06 | 5.90 | 5.90 | 264.43 |
| 1375 | 6.00 | 0.0126 | 0.1450 | 0.936 | 7.73e-06 | 5.91 | 5.91 | 263.95 |
| 1604 | 7.00 | 0.0082 | 0.1476 | 0.362 | 7.33e-06 | 5.91 | 5.91 | 263.95 |
## Training hyperparameters
The following hyperparameters were used during training; a reconstruction as `Seq2SeqTrainingArguments` follows the list.
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 25
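For reference, the settings above map roughly onto the `Seq2SeqTrainingArguments` below. This is a reconstruction from the hyperparameter list, not the original training script; the output directory is a placeholder and the precision flag is an assumption.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstructed from the hyperparameter list above; not the author's original script.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper_large_v3_turbo_finetuned_en_id_v1",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective train batch size: 8 * 4 = 32
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=25,
    eval_strategy="epoch",           # the results table reports metrics once per epoch
    save_strategy="epoch",
    fp16=True,                       # assumption; mixed precision is typical for Whisper fine-tuning
)
```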
## Framework versions
- Transformers 4.41.2
- PyTorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1