gs224/whisper-tiny-polish


	---
	library_name: transformers
	datasets:
	- FBK-MT/Speech-MASSIVE
	language:
	- pl
	metrics:
	- wer
	- bleu
	base_model:
	- openai/whisper-tiny
	pipeline_tag: automatic-speech-recognition
	---

	# Model Card

	## Model Details

	### Model Description

	This model is a fine-tuned version of OpenAI's Whisper-Tiny ASR model,
	optimized for transcribing Polish voice commands. It was fine-tuned on the
	Speech-MASSIVE dataset to improve performance on Polish utterances.
	Whisper-Tiny is a transformer-based encoder-decoder model
	pre-trained on 680,000 hours of labeled speech data.

	- Developed by: gs224
	- Language(s) (NLP): Polish
	- Finetuned from model: Whisper-tiny

	Link to the training code: https://github.com/gs224/Fine-tuning-Whisper-for-Polish-voice-commands

	## Uses

	The model can be used to transcribe Polish speech to text, including voice command recognition.
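A minimal sketch of how the model might be loaded for transcription with the `transformers` ASR pipeline. The repository ID `gs224/whisper-tiny-polish` is inferred from this page's header, and `command.wav` is a hypothetical file name; both are assumptions, not details confirmed by the card.

```python
# Assumed repository ID, inferred from the page header.
MODEL_ID = "gs224/whisper-tiny-polish"


def transcribe(audio_path: str) -> str:
    """Transcribe a single audio file and return the recognized text."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    # Build an automatic-speech-recognition pipeline around the fine-tuned model.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)

    # Given a file path, the pipeline decodes the audio (this requires ffmpeg)
    # and returns a dict whose "text" key holds the transcription.
    result = asr(audio_path)
    return result["text"]


if __name__ == "__main__":
    # "command.wav" is a hypothetical recording of a Polish voice command.
    print(transcribe("command.wav"))
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to create the pipeline once and reuse it.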

	### Out-of-Scope Use

	The model may not perform well on languages or domains it was not fine-tuned for, and it is not suitable for sensitive applications requiring very high accuracy.

	## Bias, Risks, and Limitations

	The model was fine-tuned on a relatively small subset of Polish voice data
	for a limited number of epochs, so it may underperform on some dialects or accents.
	Capital letters and punctuation in the ground-truth transcriptions
	may also inflate the Word Error Rate (WER) score.

	### Recommendations

	Future improvements could include training on larger datasets with more diverse utterances
	and normalizing case and punctuation in the ground-truth labels.
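One way to handle case and punctuation is to normalize both the reference and the model output before scoring. The helper below is an illustrative sketch, not the normalization used in the training code; the name `normalize_transcript` is hypothetical.

```python
import unicodedata


def normalize_transcript(text: str) -> str:
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    text = text.lower()
    # Remove every character whose Unicode category starts with "P" (punctuation).
    # Polish letters such as "ł" and "ę" are letters, so they are preserved.
    text = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    # Split on any whitespace and re-join with single spaces.
    return " ".join(text.split())


print(normalize_transcript("Graj plejlistę Boba Dylana!"))
# graj plejlistę boba dylana
```

Applying the same function to references and hypotheses keeps the comparison symmetric, so differences in casing or punctuation alone no longer count as word errors.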

	## Training Details

	### Training Data

	https://huggingface.co/datasets/FBK-MT/Speech-MASSIVE-test

	## Evaluation

	Word Error Rate (WER)

	### Testing Data, Factors & Metrics


	#### Metrics

	WER, a standard metric for ASR.
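WER is the word-level edit distance (substitutions, deletions, and insertions) between the reference and the hypothesis, divided by the number of reference words. The sketch below computes it for a single utterance in plain Python; it is illustrative only and may differ from the evaluation code behind this card, which may aggregate errors over the whole test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Per-utterance WER: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Two of four words differ in this example pair from the results table.
print(word_error_rate("graj plejlistę boba dylana", "graj playlistę boba delana"))
# 0.5
```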

	### Results

	Word Error Rate on the test set:

	| Base model | Fine-tuned model |
	|------------|------------------|
	| 0.8435 | 0.3176 |
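For context, the two WER values above can be turned into absolute and relative reductions. The snippet is plain arithmetic on the reported numbers and adds no new measurements.

```python
base_wer = 0.8435       # base model WER from the table above
finetuned_wer = 0.3176  # fine-tuned model WER from the table above

# Absolute reduction in WER.
absolute_reduction = base_wer - finetuned_wer
# Relative reduction: the share of the base model's errors that was removed.
relative_reduction = absolute_reduction / base_wer

print(f"absolute: {absolute_reduction:.4f}, relative: {relative_reduction:.1%}")
# absolute: 0.5259, relative: 62.3%
```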

	Example sentences:

	| Reference | Base model | Fine-tuned model |
	|-----------|------------|------------------|
	| wyślij maila do mojego brata i przypomnij o rocznicy ślubu | wysli myę latą mojego biata i przypamni o nici ślubu | wyślij maila do mojego bryata i przypomnij mi o lepszy ślubu |
	| przypomnij mi o jutrzejszym spotkaniu godzinę wcześniej | przypomnij mi o jutrzejszym spotkaniu godzinę wcześniej | przypomnij mi o jutrzejszym spotkaniu godzina wcześniej |
	| graj plejlistę boba dylana | gra i play listę boba dylana | graj playlistę boba delana |
	| graj ale jazz autorki sanah | grei, al het rust autoorkisana | graj ale jazz autorki sanah |
	| olly posłuchajmy sto jeden i trzy f. m. | oli posłuchajmy sto jeden i trzefam | olly posłuchaj we z to jeden i trzy f. m. |