metadata

library_name: transformers
license: apache-2.0
base_model: openai/whisper-large-v3-turbo
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: whisper-large-v3-turbo-ft-btb-cv-cy
    results: []
datasets:
  - techiaith/banc-trawsgrifiadau-bangor
  - techiaith/commonvoice_18_0_cy
language:
  - cy
pipeline_tag: automatic-speech-recognition

whisper-large-v3-turbo-ft-btb-cv-cy

This model is a version of openai/whisper-large-v3-turbo fine-tuned on transcriptions of spontaneous Welsh speech from Banc Trawsgrifiadau Bangor (btb), with recordings of read speech from Welsh Common Voice version 18 (cv) as additional training data.

The Whisper large-v3-turbo pre-trained model is a fine-tuned version of a pruned Whisper large-v3. In other words, this model is essentially the same as techiaith/whisper-large-v3-ft-btb-cv-cy, except that the number of decoding layers has been reduced. As a result, the model is considerably faster, at the expense of a minor degradation in quality.

It achieves the following results on the Banc Trawsgrifiadau Bangor test set:

WER: 30.27
CER: 11.14
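
For readers unfamiliar with these metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length. It assumes the figures above are percentages and applies no text normalisation; the card does not say which tool or normalisation produced the reported scores, so this illustrates the standard definition rather than the authors' evaluation script. The Welsh sentences are hypothetical examples, not taken from the test set.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate as a percentage of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate as a percentage of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical reference/hypothesis pair, for illustration only.
reference = "mae hi yn braf heddiw"
hypothesis = "mae hi'n braf heddiw"
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

Because CER counts individual characters, a small spelling or contraction difference costs far less than it does in WER, which is why the CER above is much lower than the WER.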

As such, this model is suitable for faster verbatim transcription of spontaneous or unplanned speech.

Usage

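from transformers import pipeline

# Load the fine-tuned Welsh ASR model from the Hugging Face Hub.
transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-turbo-ft-btb-cv-cy")
# Replace the placeholder with a local path or URL to an audio file.
result = transcriber("<path or url to soundfile>")
print(result)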

{'text': "ymm, yn y pum mlynadd dwitha 'ma ti 'di... Ie. ...bod drw dipyn felly do?"}
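
Recordings of spontaneous speech are often longer than the 30-second window Whisper processes at a time. The sketch below shows one possible way to extend the basic call in the Usage section using the `chunk_length_s` and `return_timestamps` options of the transformers ASR pipeline. The helper names `pipeline_kwargs` and `transcribe_long`, the 30-second chunk length and the `"cpu"` default are illustrative assumptions, not settings recommended by the model authors.

```python
MODEL_ID = "techiaith/whisper-large-v3-turbo-ft-btb-cv-cy"


def pipeline_kwargs(chunk_length_s: float = 30.0, device: str = "cpu") -> dict:
    """Build keyword arguments for transformers.pipeline (hypothetical helper)."""
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "chunk_length_s": chunk_length_s,  # split long audio into chunks of this length
        "device": device,                  # assumption: use e.g. "cuda:0" if a GPU is available
    }


def transcribe_long(audio_path: str, **overrides) -> dict:
    """Transcribe a long recording, returning the text and timestamped chunks."""
    # Imported lazily so that defining this helper does not load transformers.
    from transformers import pipeline

    transcriber = pipeline(**pipeline_kwargs(**overrides))
    # return_timestamps=True adds a "chunks" list with (start, end) times per segment.
    return transcriber(audio_path, return_timestamps=True)
```

Calling `transcribe_long` with a path or URL to a sound file should return a dictionary with a `"text"` entry, as in the basic example, plus a `"chunks"` entry holding the timestamped segments.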