
	---
	license: apache-2.0
	datasets:
	- ARTPARK-IISc/Vaani
	language:
	- kn
	base_model:
	- openai/whisper-small
	pipeline_tag: automatic-speech-recognition
	---


	# Whisper-small-vaani-kannada

	This is a fine-tuned version of [OpenAI's Whisper-Small](https://huggingface.co/openai/whisper-small), trained on Kannada speech from multiple datasets.

	# Usage
	The model can be used with the `pipeline` function from the Transformers library.
	```python
	import torch
	from transformers import pipeline

	# Path to the audio file to be transcribed
	audio = "path to the audio file to be transcribed"
	# Use the first GPU if one is available, otherwise fall back to the CPU
	device = "cuda:0" if torch.cuda.is_available() else "cpu"
	model_id = "ARTPARK-IISc/whisper-small-vaani-kannada"

	# Long recordings are processed in 30-second chunks
	transcribe = pipeline(task="automatic-speech-recognition", model=model_id, chunk_length_s=30, device=device)
	# Force Kannada transcription; Whisper's language code for Kannada is "kn"
	transcribe.model.config.forced_decoder_ids = transcribe.tokenizer.get_decoder_prompt_ids(language="kn", task="transcribe")

	print("Transcription:", transcribe(audio)["text"])
	```
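As an alternative to setting `forced_decoder_ids` on the model config, recent Transformers releases also accept the language and task through `generate_kwargs` at call time. The sketch below assumes such a release and assumes the checkpoint's generation config supports the `language` argument; the helper name `transcribe_file` and the constant `GENERATE_KWARGS` are illustrative and not part of this repository.

```python
# Decoding options forwarded to Whisper's generate(). "kannada" is the language
# name Whisper uses for Kannada; choosing it here is an assumption based on the
# model name, not something stated in the model card.
GENERATE_KWARGS = {"language": "kannada", "task": "transcribe"}


def transcribe_file(audio_path, model_id="ARTPARK-IISc/whisper-small-vaani-kannada"):
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be imported without loading torch
    import torch
    from transformers import pipeline

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    asr = pipeline(
        task="automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # same 30-second chunking as in the snippet above
        device=device,
    )
    # Language and task are passed per call instead of via the model config
    result = asr(audio_path, generate_kwargs=GENERATE_KWARGS)
    return result["text"]


if __name__ == "__main__":
    import sys

    # Hypothetical command-line use: python transcribe.py recording.wav
    if len(sys.argv) > 1:
        print("Transcription:", transcribe_file(sys.argv[1]))
```

If the installed Transformers version or the checkpoint's generation config rejects the `language` argument, the `forced_decoder_ids` approach shown earlier remains the fallback.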
	# Training and Evaluation

	The model was fine-tuned on the following datasets: [Vaani](https://huggingface.co/datasets/ARTPARK-IISc/Vaani), [Fleurs](https://huggingface.co/datasets/google/fleurs), and [IndicTTS](https://huggingface.co/datasets/SPRINGLab/IndicTTS-Hindi).


	The model was evaluated on multiple datasets; the word error rate (WER) for each is reported below.

	| Dataset | WER (%) |
	| :---: | :---: |
	| Fleurs | 29.16 |
	| IndicTTS | 15.27 |
	| Kathbath | 33.94 |
	| Kathbath Noisy | 38.46 |
	| Vaani | 69.78 |
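
The model card does not state how the WER figures were computed or which text normalization was applied. For orientation only, here is a minimal sketch of the standard definition: word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, expressed as a percentage. The function name `word_error_rate` and the example sentences are illustrative assumptions, and whitespace tokenization without normalization is a simplification.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent for one reference/hypothesis pair."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Made-up example: one substituted word out of three reference words
reference = "ನಾನು ಮನೆಗೆ ಹೋಗುತ್ತೇನೆ"
hypothesis = "ನಾನು ಮನೆ ಹೋಗುತ್ತೇನೆ"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")  # prints "WER: 33.33"
```

Corpus-level WER is normally computed by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance scores.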