Whisper-Tiny Welsh / Cy
This is a whisper-tiny model based on techiaith/whisper-tiny-ft-cy-en, fine-tuned using the ACFT method for use on Android phones.
This model can be loaded into FUTO Keyboard, and most likely into other similar keyboards (HeliBoard, FlorisBoard, AnySoftKeyboard, possibly even SwiftKey). More info on this can be found here.
Android Installation Instructions
To use this model with FUTO Keyboard:
- Download the .bin file from download/whisper-tiny-welsh.bin onto your Android phone.
- Either tap the file and choose 'Open with FUTO Keyboard', or open FUTO Keyboard, go to Languages & Models > Add Language > Welsh, and under 'Voice Input Model' tap to open the .bin file.
- From within FUTO Keyboard, tap the microphone, speak Welsh ('siarad Cymraeg') and check the transcription.
Training and evaluation data
Trained and evaluated on welsh-transcription-samples, a subset of Mozilla's Common Voice CY dataset. The subset is smaller than the full corpus, which makes it practical for low-budget training without a GPU. Training on the full Mozilla Common Voice corpus may give better results.
Model Setup
# Training hyperparameters
LEARNING_RATE = 1e-6
NUM_EPOCHS = 8
# (Note: only recordings < 29.0s were used)
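The duration filter mentioned in the note above could look roughly like the sketch below. It is not the actual training script: the `duration_s` field, the `keep_short_clips` helper and the sample file names are hypothetical, and only the 29.0 s threshold comes from this card. Whisper processes audio in 30-second windows, so longer clips would otherwise be truncated.

```python
# Minimal sketch, not the original training code.
# Assumption: each sample is a dict carrying its length in seconds under "duration_s".
MAX_DURATION_S = 29.0  # threshold taken from the note above


def keep_short_clips(samples, max_duration_s=MAX_DURATION_S):
    """Return only the samples strictly shorter than max_duration_s."""
    return [s for s in samples if s["duration_s"] < max_duration_s]


# Hypothetical clips, for illustration only.
clips = [
    {"path": "a.wav", "duration_s": 4.2},
    {"path": "b.wav", "duration_s": 29.0},  # dropped: not strictly below 29.0
    {"path": "c.wav", "duration_s": 31.5},  # dropped: too long
]

short_clips = keep_short_clips(clips)
print([c["path"] for c in short_clips])  # prints ['a.wav']
```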
WER & CER
Dataset | WER | CER |
---|---|---|
CommonVoice CY (ClemSummer, validation split) | 62.99 | 21.46 |
techiaith/banc-trawsgrifiadau-bangor | TODO: no access | TODO: no access |
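For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how WER and CER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, at word level and character level respectively. It assumes the figures in the table are percentages. The card does not say which tool or text normalisation produced them, so this sketch will not necessarily reproduce those numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative sentence pair (not taken from the evaluation data).
reference = "mae hi'n braf heddiw"
hypothesis = "mae hin braf heddiw"
print(wer(reference, hypothesis))  # prints 25.0 (1 of 4 words wrong)
print(cer(reference, hypothesis))  # prints 5.0 (1 of 20 characters wrong)
```

A single dropped apostrophe costs a whole word in WER but only one character in CER, which is one reason the WER in the table is so much higher than the CER.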
Thanks to techiaith and ClemSummer for their prior work. Diolch
Model tree for pjrobertson/whisper-tiny-welsh-cy
Base model
openai/whisper-tiny