whisper-tiny-polish / README.md
metadata
library_name: transformers
datasets:
  - FBK-MT/Speech-MASSIVE
language:
  - pl
metrics:
  - wer
  - bleu
base_model:
  - openai/whisper-tiny
pipeline_tag: automatic-speech-recognition

Model Card

Model Details

Model Description

This model is a fine-tuned version of OpenAI's Whisper-Tiny ASR model, optimized for transcribing Polish voice commands. It was fine-tuned on the Speech-MASSIVE dataset to improve performance on Polish utterances. Whisper-Tiny is a transformer-based encoder-decoder model pre-trained on 680,000 hours of labeled speech data.

  • Developed by: gs224
  • Language(s) (NLP): Polish
  • Finetuned from model: openai/whisper-tiny

Link to the training code: https://github.com/gs224/Fine-tuning-Whisper-for-Polish-voice-commands

Uses

The model can be used to transcribe Polish speech to text, including voice command recognition.
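A minimal inference sketch using the `pipeline` API from transformers. The repository id `gs224/whisper-tiny-polish` is inferred from the developer and model names on this card and is an assumption, as is the helper name `transcribe`.

```python
# Assumed Hub repository id, inferred from this card (not confirmed).
MODEL_ID = "gs224/whisper-tiny-polish"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe a single audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    # Build a speech recognition pipeline around the fine-tuned checkpoint.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a single file, the pipeline returns a dict with a "text" entry.
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("command.wav")`, where `command.wav` is a hypothetical recording of a Polish voice command, should return the transcription as a string. Decoding audio from a file path requires ffmpeg to be installed.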

Out-of-Scope Use

The model may not perform well on languages or domains it was not fine-tuned for, and it is not suitable for sensitive applications requiring very high accuracy.

Bias, Risks, and Limitations

The fine-tuning was performed on a relatively small subset of Polish voice data for a limited number of epochs, so the model may underperform on some dialects or accents. Capital letters and punctuation in the ground-truth transcriptions may also affect the Word Error Rate (WER) score: when WER is computed on raw text, a word that differs from the reference only in case or punctuation counts as an error.

Recommendations

Future improvements could include training on larger datasets, more diverse utterances, and addressing case sensitivity and punctuation in ground-truth labels.
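One way to address case sensitivity and punctuation is to normalize both reference and prediction before scoring. The sketch below is an illustration, not the preprocessing used for this model; the function name `normalize` and the sample sentence are hypothetical.

```python
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    # Remove ASCII punctuation only; Polish letters such as "ś" or "ę" are kept.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse runs of whitespace left behind by the removed characters.
    return " ".join(text.split())


# Illustrative input, not taken from the dataset.
print(normalize("Graj ale jazz, autorki Sanah!"))  # graj ale jazz autorki sanah
```

Applying the same function to both sides keeps the comparison fair. Note that it also turns abbreviations such as "f. m." into "f m".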

Training Details

Training Data

https://huggingface.co/datasets/FBK-MT/Speech-MASSIVE-test

Evaluation

Word Error Rate (WER)

Testing Data, Factors & Metrics

Metrics

WER, a standard metric for ASR.
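WER is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference. The sketch below is a plain-Python illustration of that definition. This card does not state which implementation produced the reported scores, so the function `wer` is an assumption for demonstration only. The test sentences come from the examples in the Results section.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of a hypothesis against a single reference sentence."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


reference = "graj plejlistę boba dylana"
print(wer(reference, "graj playlistę boba delana"))    # 0.5
print(wer(reference, "gra i play listę boba dylana"))  # 1.0
```

The first hypothesis has two substitutions against four reference words. The second has two substitutions and two insertions, which shows that insertions can push WER to 1.0 or above. A corpus-level score is usually computed as total errors over total reference words rather than as the mean of per-sentence values.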

Results

Word Error Rate on the test set:

| Base model | Fine-tuned model |
|---|---|
| 0.8435 | 0.3176 |
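To put the two scores in perspective, the short calculation below derives the absolute and relative WER reduction from the reported values. The variable names are illustrative.

```python
base_wer = 0.8435       # reported WER of the base model
finetuned_wer = 0.3176  # reported WER of the fine-tuned model

# Absolute reduction in WER.
absolute_reduction = base_wer - finetuned_wer
# Reduction relative to the base model's WER.
relative_reduction = absolute_reduction / base_wer

print(f"absolute: {absolute_reduction:.4f}")  # absolute: 0.5259
print(f"relative: {relative_reduction:.1%}")  # relative: 62.3%
```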

Example sentences:

| Reference | Base model | Fine-tuned model |
|---|---|---|
| wyślij maila do mojego brata i przypomnij o rocznicy ślubu | wysli myę latą mojego biata i przypamni o nici ślubu | wyślij maila do mojego bryata i przypomnij mi o lepszy ślubu |
| przypomnij mi o jutrzejszym spotkaniu godzinę wcześniej | przypomnij mi o jutrzejszym spotkaniu godzinę wcześniej | przypomnij mi o jutrzejszym spotkaniu godzina wcześniej |
| graj plejlistę boba dylana | gra i play listę boba dylana | graj playlistę boba delana |
| graj ale jazz autorki sanah | grei, al het rust autoorkisana | graj ale jazz autorki sanah |
| olly posłuchajmy sto jeden i trzy f. m. | oli posłuchajmy sto jeden i trzefam | olly posłuchaj we z to jeden i trzy f. m. |