Whisper Small fine-tuned for Kannada using IAST Romanization via Aksharamukha, addressing token limits in non-Roman scripts.

This is a Whisper Small model fine-tuned for Kannada automatic speech recognition (ASR). Whisper's decoder is limited to 448 tokens per segment, and its tokenizer splits non-Roman scripts such as Kannada into many tokens per word, so Kannada transcripts use up that budget quickly and decode slowly. To address this, we fine-tuned the model on Romanized Kannada text in IAST (International Alphabet of Sanskrit Transliteration), generated with the Aksharamukha library. The shorter token sequences result in 2x faster inference.

Example:

Kannada Tokens: ['à²', '¹', 'à²', '¾', 'à²', '°', 'à³į', 'à²', '¦', 'à²', '¿', 'à²', 'ķ', 'Ġà²', '¶', 'à³', 'ģ', 'à²', 'Ń', 'à²', '¾', 'à²', '¶', 'à²', '¯', 'à²', 'Ĺ', 'à²', '³', 'à³', 'ģ']

Kannada Token Count: 31

IAST Tokens: ['h', 'Äģ', 'rd', 'ika', 'ĠÅĽ', 'ub', 'h', 'Äģ', 'ÅĽ', 'ay', 'ag', 'al', 'Ì', '¤', 'u']

IAST Token Count: 15

The Romanized (IAST) text needs 15 tokens, compared with 31 for the same phrase in Kannada script, so the decoder has fewer tokens to generate and runs faster.
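The difference can be seen before any tokenizer is loaded. Whisper's tokenizer is a byte-level BPE, so the UTF-8 byte length of a string is the worst-case token count before merges are applied. The sketch below uses only the standard library; the Kannada string is reconstructed from the byte tokens listed above (an assumption), and the byte counts are an upper bound, not the tokenizer's actual output.

```python
# Compare UTF-8 byte lengths of the same phrase in Kannada script and IAST.
# Byte-level BPE starts from raw bytes, so fewer bytes usually means fewer tokens.

# Reconstructed from the token listing above (assumption, not taken from the card).
kannada = "ಹಾರ್ದಿಕ ಶುಭಾಶಯಗಳು"
iast = "hārdika śubhāśayagal̤u"


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


kannada_bytes = utf8_len(kannada)
iast_bytes = utf8_len(iast)

print(f"Kannada script: {len(kannada)} code points, {kannada_bytes} bytes")
print(f"IAST:           {len(iast)} code points, {iast_bytes} bytes")
print(f"Byte ratio (Kannada / IAST): {kannada_bytes / iast_bytes:.2f}")
```

Every Kannada code point takes three bytes in UTF-8, while most IAST letters take one and the accented ones take two, which is consistent with the roughly 2:1 token counts reported above.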

Performance

  • Test WER: 28.97%
  • Test CER: 5.66%
  • Test WER with beam search and normalization: 23.12%
  • Test CER with beam search and normalization: 4.95%
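The card does not state how these scores were computed or which text normalization was applied. As a point of reference, the sketch below implements the common definitions: word error rate (WER) and character error rate (CER) as Levenshtein edit distance divided by the reference length. The function names and the sample strings are illustrative, not taken from the evaluation.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative strings only: one dropped character in a two-word reference.
reference = "namaskara hegiddira"
hypothesis = "namaskara hegidira"
print(f"WER: {wer(reference, hypothesis):.2%}")
print(f"CER: {cer(reference, hypothesis):.2%}")
```

A single wrong character counts as a whole wrong word for WER but only as one character for CER, which is why CER is typically much lower than WER on the same output.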

Usage

#!pip install whisper_transcriber aksharamukha
from whisper_transcriber import WhisperTranscriber
from aksharamukha import transliterate

# Initialize the transcriber
transcriber = WhisperTranscriber(model_name="coild/whisper_small_kannada_translit_IAST")

# Transcribe an audio file with automatic transcript printing
results = transcriber.transcribe(
    "audio_file.mp3",
    min_segment=25,
    max_segment=30,
    silence_duration=0.2,
    sample_rate=16000,
    batch_size=4,
    normalize=True,
    normalize_text=True,
    verbose=False
)

# Apply transliteration to all results
for segment in results:
    print(f"\n[{segment['start']} --> {segment['end']}]")
    print(transliterate.process('IAST', 'Kannada', segment['transcript']))

Model Details

Model Description

  • Developed by: Ranjan Shettigar
  • Language(s) (NLP): kn
  • Finetuned from model [OpenAI]: whisper-small
  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Training Details

Training and evaluation data

Training Data:

Evaluation Data:

Training Hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • optimizer: adamw
  • epochs: 4
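The card does not say which training framework was used. Assuming the Hugging Face `transformers` trainer, the values above would map onto `Seq2SeqTrainingArguments` keyword arguments roughly as follows. The dictionary name, the `adamw_torch` optimizer variant, and the per-device interpretation of the batch sizes are assumptions.

```python
# Hypothetical mapping of the listed hyperparameters to keyword arguments
# accepted by transformers.Seq2SeqTrainingArguments. Nothing here is confirmed
# by the card beyond the numeric values themselves.
training_kwargs = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,  # card lists train_batch_size: 16
    "per_device_eval_batch_size": 16,   # card lists eval_batch_size: 16
    "optim": "adamw_torch",             # card says only "adamw"; variant assumed
    "num_train_epochs": 4,
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")

# With transformers installed, these could be passed as:
#   Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)
```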

Citation [optional]

BibTeX:

[More Information Needed]

Model size: 242M params (Safetensors, F32)