DewiBrynJones committed · Commit b8ac8c5 · verified · 1 Parent(s): d4dda56

Update README.md

Files changed (1)
  1. README.md +13 -8
README.md CHANGED
@@ -12,23 +12,28 @@ model-index:
  datasets:
  - techiaith/banc-trawsgrifiadau-bangor
  - techiaith/commonvoice_18_0_cy
+ - techiaith/commonvoice_vad_cy
+ - cymen-arfor/lleisiau-arfor
  language:
  - cy
+ - en
  pipeline_tag: automatic-speech-recognition
  ---
 
- # whisper-large-v3-ft-btb-cv-cy
+ # whisper-large-v3-ft-verbatim-cy-en
 
  This model is a version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) finedtuned with
- transcriptions of Welsh language spontaneous speech [Banc Trawsgrifiadau Bangor (btb)](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor)
- ac well as recordings of read speach from [Welsh Common Voice version 18 (cv)](https://huggingface.co/datasets/techiaith/commonvoice_18_0_cy)
- for additional training.
+ transcriptions of Welsh language spontaneous speech from
+ [Banc Trawsgrifiadau Bangor (btb)](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor) and
+ [Lleisiau Arfor](https://huggingface.co/datasets/cymen-arfor/lleisiau-arfor) as well as recordings of read speech
+ from [Welsh Common Voice version 18 (cv)](https://huggingface.co/datasets/techiaith/commonvoice_18_0_cy) and
+ [Welsh Common Voice Vad Segments](https://huggingface.co/datasets/techiaith/commonvoice_vad_cy) for additional training.
 
- As such this model is suitable for more verbatim transcribing of spontaneous or unplanned speech.
- It achieves the following results on the [Banc Trawsgrifiadau Bangor'r test set](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor/viewer/default/test)
+ As such this model is suitable for more verbatim transcribing of spontaneous or unplanned speech. It achieves the
+ following results on the [Banc Trawsgrifiadau Bangor'r test set](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor/viewer/default/test)
 
- - WER: 29.72
- - CER: 11.01
+ - WER: 28.99
+ - CER: 10.27
 
 
  ## Usage