United Medical ASR Indic (Accelerated Training)

This model is a fine-tuned version of distil-whisper/distil-large-v3.5 for multilingual medical speech recognition across Indian languages. It was trained with Hugging Face Accelerate for distributed, mixed-precision fine-tuning.

Performance Optimizations

  • Multi-GPU Training: Distributed training across available GPUs
  • Mixed Precision: FP16 training, for roughly 2x faster training
  • Gradient Accumulation: Effective large batch sizes
  • Memory Optimization: Gradient checkpointing and efficient data loading
  • Batch Processing: Locations processed in optimized batches

Model Description

  • Base Model: distil-whisper/distil-large-v3.5
  • Languages: Multiple Indian languages (Hindi, English, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Urdu)
  • Domain: Medical ASR for Indic languages
  • Training: Accelerated sequential fine-tuning with batch processing

Usage

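from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch

processor = WhisperProcessor.from_pretrained("abar-uwc/united-med-asr-indic")
model = WhisperForConditionalGeneration.from_pretrained("abar-uwc/united-med-asr-indic")
model.eval()

# `audio` is assumed to be a 1-D float array of mono speech sampled at 16 kHz
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

# No language is forced, so the model will automatically detect and transcribe in the appropriate language
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])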
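
Whisper-family feature extractors expect 16 kHz mono audio, so recordings at other sample rates need to be converted before they are passed to `processor`. The helper below is a minimal sketch of that step using only NumPy. The name `to_mono_16k` and the `(channels, samples)` layout are assumptions made for illustration, and the linear interpolation applies no anti-aliasing filter, so a dedicated resampler (for example from librosa or torchaudio) is the better choice for real clinical audio.

```python
import numpy as np

TARGET_SR = 16_000  # sample rate expected by Whisper feature extractors


def to_mono_16k(waveform, sample_rate):
    """Downmix to mono and resample to 16 kHz with linear interpolation (hypothetical helper)."""
    audio = np.asarray(waveform, dtype=np.float32)

    # Assumed layout for multi-channel input: (channels, samples).
    if audio.ndim == 2:
        audio = audio.mean(axis=0)

    if sample_rate == TARGET_SR:
        return audio

    # Map each output sample time onto the original time axis and interpolate.
    n_out = int(round(audio.shape[0] * TARGET_SR / sample_rate))
    old_times = np.arange(audio.shape[0]) / sample_rate
    new_times = np.arange(n_out) / TARGET_SR
    return np.interp(new_times, old_times, audio).astype(np.float32)
```

The returned array can then be passed to the processor as `processor(to_mono_16k(waveform, sample_rate), sampling_rate=16000, return_tensors="pt")`.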
Model size: 756M params (Safetensors, tensor type F32)