DewiBrynJones committed on
Commit
2a5e258
1 Parent(s): 93237c9

Update README.md

  1. README.md +53 -81
README.md CHANGED
@@ -1,81 +1,53 @@
- ---
- license: apache-2.0
- base_model: openai/whisper-large-v3
- tags:
- - generated_from_trainer
- datasets:
- - DewiBrynJones/commonvoice_18_0_cy
- metrics:
- - wer
- model-index:
- - name: whisper-large-v3-ft-cv-cy-train-all-plus-other-with-excluded
- results:
- - task:
- name: Automatic Speech Recognition
- type: automatic-speech-recognition
- dataset:
- name: DewiBrynJones/commonvoice_18_0_cy default
- type: DewiBrynJones/commonvoice_18_0_cy
- args: default
- metrics:
- - name: Wer
- type: wer
- value: 0.1676010974591435
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # whisper-large-v3-ft-cv-cy-train-all-plus-other-with-excluded
-
- This model is a fine-tuned version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) on the DewiBrynJones/commonvoice_18_0_cy default dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.3280
- - Wer: 0.1676
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 16
- - eval_batch_size: 16
- - seed: 42
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 32
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 500
- - training_steps: 5000
- - mixed_precision_training: Native AMP
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss | Wer |
- |:-------------:|:------:|:----:|:---------------:|:------:|
- | 0.1583 | 1.4144 | 1000 | 0.2562 | 0.2062 |
- | 0.0675 | 2.8289 | 2000 | 0.2394 | 0.1849 |
- | 0.0113 | 4.2433 | 3000 | 0.2729 | 0.1722 |
- | 0.0036 | 5.6577 | 4000 | 0.3004 | 0.1705 |
- | 0.0012 | 7.0721 | 5000 | 0.3280 | 0.1676 |
-
-
- ### Framework versions
-
- - Transformers 4.44.0
- - Pytorch 2.4.0+cu121
- - Datasets 2.20.0
- - Tokenizers 0.19.1
 
+ ---
+ license: apache-2.0
+ base_model: openai/whisper-large-v3
+ tags:
+ - generated_from_trainer
+ - whisper
+ datasets:
+ - techiaith/commonvoice_18_0_cy
+ metrics:
+ - wer
+ model-index:
+ - name: whisper-large-v3-ft-cv-cy-train-all-plus-other-with-excluded
+ results:
+ - task:
+ name: Automatic Speech Recognition
+ type: automatic-speech-recognition
+ dataset:
+ name: DewiBrynJones/commonvoice_18_0_cy default
+ type: DewiBrynJones/commonvoice_18_0_cy
+ args: default
+ metrics:
+ - name: Wer
+ type: wer
+ value: 0.185
+ language:
+ - cy
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # whisper-large-v3-ft-cv-cy
+
+ This model is a version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) fine-tuned with the
+ `train_all` and `other_with_excluded` custom splits from [techiaith/commonvoice_18_0_cy](https://huggingface.co/datasets/techiaith/commonvoice_18_0_cy).
+
+ It achieves the following results on the standard test set of Common Voice for Welsh, release 18:
+
+ - WER: 18.50%
+ - CER: 5.32%
+
+ N.B. Compared with the [bilingual model](https://huggingface.co/techiaith/whisper-large-v3-ft-cv-cy-en), this model performs considerably worse on English-language speech but better on Welsh.
+
+
+ ## Usage
+
+ ```python
+ from transformers import pipeline
+
+ transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-cv-cy")
+ # Replace the placeholder with a local path or URL to a sound file.
+ result = transcriber("path/to/soundfile.wav")
+ print(result)
+ ```
+
+ `{'text': 'Mae hen wlad fy nhadau yn annwyl i mi.'}`
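
The WER and CER figures reported in the new README are edit-distance ratios over words and characters respectively. The commit does not say which tool or text normalization produced them, so the following is only a minimal sketch of the standard Levenshtein-based definitions, not the author's evaluation script. The function names and the misspelled hypothesis sentence are invented for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Reference taken from the README's sample output; the hypothesis is a made-up
# transcription with one misspelled word ("anwyl" instead of "annwyl").
reference = "Mae hen wlad fy nhadau yn annwyl i mi."
hypothesis = "Mae hen wlad fy nhadau yn anwyl i mi."

print(f"WER: {100 * wer(reference, hypothesis):.2f}%")
print(f"CER: {100 * cer(reference, hypothesis):.2f}%")
```

One wrong word out of nine gives a WER of 1/9, while the single missing character out of 38 gives a much smaller CER, which is why the README's CER is well below its WER. Published scores normally also depend on casing and punctuation normalization, which this sketch omits.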