metadata

license: apache-2.0
base_model: openai/whisper-large-v3
tags:
  - generated_from_trainer
  - whisper
datasets:
  - techiaith/commonvoice_18_0_cy
metrics:
  - wer
model-index:
  - name: whisper-large-v3-ft-cv-cy-train-all-plus-other-with-excluded
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: DewiBrynJones/commonvoice_18_0_cy default
          type: DewiBrynJones/commonvoice_18_0_cy
          args: default
        metrics:
          - name: Wer
            type: wer
            value: 0.185
language:
  - cy
pipeline_tag: automatic-speech-recognition

whisper-large-v3-ft-cv-cy

This model is a version of openai/whisper-large-v3 fine-tuned on the train_all and other_with_excluded custom splits of techiaith/commonvoice_18_0_cy.

It achieves the following results on the standard test set of Common Voice for Welsh, release 18:

WER: 18.50
CER: 5.32
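For readers unfamiliar with these metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly computed, expressed as percentages (the metadata value of 0.185 corresponds to the WER of 18.50 above). It assumes plain Levenshtein distance with no text normalisation. The evaluation script actually used for this model is not described here and may differ, and the helper names `edit_distance`, `wer` and `cer` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of the reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of the reference character count."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Illustrative strings only (assumed example, not taken from the test set)
reference = "Mae hen wlad fy nhadau yn annwyl i mi."
hypothesis = "Mae hen wlad fy nhadau yn anwyl i mi."
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

Because a single wrong character makes the whole word count as an error, WER is typically much higher than CER for the same transcript, which is consistent with the gap between the two figures above.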

N.B. This model performs considerably worse on English-language speech, but it performs better on Welsh than a bilingual model.

Usage

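from transformers import pipeline

# Load the fine-tuned Welsh model into a speech-recognition pipeline
transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-cv-cy")
# Replace the placeholder with a local path or URL to an audio file
result = transcriber("<path or url to soundfile>")
print(result)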

{'text': 'Mae hen wlad fy nhadau yn annwyl i mi.'}