kostissz committed
Commit ae19d9c · verified · Parent: df4a700

Upload README.md with huggingface_hub

Files changed (1): README.md (+32 −77)
README.md CHANGED
@@ -1,86 +1,41 @@
  ---
- library_name: transformers
- license: mit
  base_model: openai/whisper-large-v3-turbo
- tags:
- - generated_from_trainer
- metrics:
- - wer
  model-index:
- - name: whisper-large-v3-turbo-bn
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # whisper-large-v3-turbo-bn
-
- This model is a fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.1089
- - Model Preparation Time: 0.0054
- - Wer Ortho: 26.4357
- - Wer: 11.0532
- - Cer Ortho: 7.5370
- - Cer: 6.0587
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 64
- - eval_batch_size: 64
- - seed: 42
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 50
- - training_steps: 2000
- - mixed_precision_training: Native AMP
-
- ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Wer Ortho | Wer | Cer Ortho | Cer |
- |:-------------:|:------:|:----:|:---------------:|:----------------------:|:---------:|:-------:|:---------:|:-------:|
- | 0.3227 | 0.2985 | 100 | 0.1940 | 0.0054 | 51.5425 | 24.7080 | 15.4053 | 12.4001 |
- | 0.1585 | 0.5970 | 200 | 0.1541 | 0.0054 | 43.2052 | 20.5941 | 13.2261 | 10.8048 |
- | 0.1296 | 0.8955 | 300 | 0.1284 | 0.0054 | 37.8651 | 17.2373 | 11.1337 | 9.0344 |
- | 0.0984 | 1.1940 | 400 | 0.1181 | 0.0054 | 35.7179 | 15.8488 | 10.4589 | 8.4411 |
- | 0.0831 | 1.4925 | 500 | 0.1117 | 0.0054 | 33.2943 | 14.7721 | 9.6520 | 7.8391 |
- | 0.0778 | 1.7910 | 600 | 0.1051 | 0.0054 | 31.8858 | 13.8226 | 9.1472 | 7.3725 |
- | 0.0676 | 2.0896 | 700 | 0.1034 | 0.0054 | 30.0816 | 13.0923 | 8.6039 | 6.9990 |
- | 0.0498 | 2.3881 | 800 | 0.0993 | 0.0054 | 29.2819 | 12.6100 | 8.3244 | 6.7352 |
- | 0.0487 | 2.6866 | 900 | 0.0960 | 0.0054 | 28.8242 | 12.4199 | 8.3398 | 6.6952 |
- | 0.0473 | 2.9851 | 1000 | 0.0946 | 0.0054 | 28.5619 | 12.1879 | 8.1810 | 6.6184 |
- | 0.0322 | 3.2836 | 1100 | 0.0994 | 0.0054 | 27.7306 | 11.7283 | 7.8936 | 6.3322 |
- | 0.0304 | 3.5821 | 1200 | 0.0974 | 0.0054 | 27.9168 | 11.8686 | 8.0736 | 6.4797 |
- | 0.0304 | 3.8806 | 1300 | 0.0956 | 0.0054 | 27.2904 | 11.4362 | 7.7514 | 6.2139 |
- | 0.0228 | 4.1791 | 1400 | 0.1023 | 0.0054 | 26.9930 | 11.2349 | 7.6544 | 6.1286 |
- | 0.0179 | 4.4776 | 1500 | 0.0998 | 0.0054 | 26.7448 | 11.1543 | 7.6114 | 6.1014 |
- | 0.0175 | 4.7761 | 1600 | 0.1014 | 0.0054 | 26.7975 | 11.1427 | 7.5925 | 6.0777 |
- | 0.0163 | 5.0746 | 1700 | 0.1075 | 0.0054 | 26.7530 | 11.1690 | 7.6298 | 6.1284 |
- | 0.01 | 5.3731 | 1800 | 0.1086 | 0.0054 | 26.5434 | 11.1396 | 7.5930 | 6.1084 |
- | 0.0097 | 5.6716 | 1900 | 0.1084 | 0.0054 | 26.5446 | 11.0813 | 7.5709 | 6.0733 |
- | 0.0096 | 5.9701 | 2000 | 0.1089 | 0.0054 | 26.4357 | 11.0532 | 7.5370 | 6.0587 |
-
- ### Framework versions
-
- - Transformers 4.49.0
- - Pytorch 2.6.0+cu124
- - Datasets 3.3.2
- - Tokenizers 0.21.0
 
  ---
  base_model: openai/whisper-large-v3-turbo
+ datasets:
+ - bn
+ language: bn
+ library_name: transformers
+ license: apache-2.0
  model-index:
+ - name: Finetuned openai/whisper-large-v3-turbo on Bengali
+   results:
+   - task:
+       type: automatic-speech-recognition
+       name: Speech-to-Text
+     dataset:
+       name: Common Voice (Bengali)
+       type: common_voice
+     metrics:
+     - type: wer
+       value: 11.053
  ---

+ # Finetuned openai/whisper-large-v3-turbo on 21409 Bengali training audio samples from cv-corpus-21.0-2025-03-14/bn.

+ This model was created from the Mozilla.ai Blueprint:
+ [speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).

+ ## Evaluation results on 9363 audio samples of Bengali:

+ ### Baseline model (before finetuning) on Bengali
+ - Word Error Rate (Normalized): 78.843
+ - Word Error Rate (Orthographic): 107.027
+ - Character Error Rate (Normalized): 62.521
+ - Character Error Rate (Orthographic): 72.012
+ - Loss: 1.074

+ ### Finetuned model (after finetuning) on Bengali
+ - Word Error Rate (Normalized): 11.053
+ - Word Error Rate (Orthographic): 26.436
+ - Character Error Rate (Normalized): 6.059
+ - Character Error Rate (Orthographic): 7.537
+ - Loss: 0.109
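The gap between each orthographic score and its normalized counterpart in the card above comes from text normalization (case folding, punctuation stripping, etc.) applied to references and hypotheses before scoring. The sketch below illustrates word error rate and why normalization lowers it; the normalizer here is a toy stand-in for illustration, not the one this evaluation actually used.

```python
import string


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance, kept in a single rolling row.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            # prev holds the old d[j-1]; d[j] is still the old value here.
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
    return d[-1] / len(ref)


def normalize(text: str) -> str:
    # Toy normalizer: lowercase and drop ASCII punctuation. Real ASR
    # normalizers (e.g. Whisper's) are language-aware and more involved.
    return text.lower().translate(str.maketrans("", "", string.punctuation))


ref, hyp = "Hello, world!", "hello world"
print(wer(ref, hyp))                          # orthographic: 1.0 (both words differ)
print(wer(normalize(ref), normalize(hyp)))    # normalized: 0.0
```

The same idea explains why the orthographic WER (26.436) sits well above the normalized WER (11.053): many "errors" are pure spelling-convention mismatches that normalization removes.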
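For completeness, a minimal transcription sketch with the transformers `pipeline` API. The repo id below is an assumption inferred from the committer and model name, and `audio.wav` is a placeholder; substitute this card's actual model id and your own Bengali audio file.

```python
from transformers import pipeline

# Assumed repo id (hypothetical) -- replace with this card's actual model id.
asr = pipeline(
    "automatic-speech-recognition",
    model="kostissz/whisper-large-v3-turbo-bn",
)

# Transcribe a local Bengali audio file (placeholder path).
result = asr("audio.wav", generate_kwargs={"language": "bengali"})
print(result["text"])
```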