Open-Sarika

Open-Sarika is a speech recognition and translation model for three Indian languages: Hindi, Gujarati, and Marathi. It transcribes speech in these languages and translates between them. It is an open-source implementation inspired by Sarvam AI's Sarika model.

Model Details

Model Description

  • Model type: Speech Recognition and Translation (based on Whisper architecture)
  • Language(s): Hindi (hi), Gujarati (gu), Marathi (mr)
  • License: MIT
  • Base Model: openai/whisper-large-v3
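
The language codes listed above match the codes used by the Whisper tokenizer, so they can be passed to `generate` to pin the language instead of relying on auto-detection. The helper below is a minimal sketch: `SUPPORTED_LANGUAGES` and `generation_kwargs` are hypothetical names and not part of this repository, and the `language`/`task` keyword arguments assume a recent version of transformers. In stock Whisper, `task="translate"` produces English output; this card does not document how translation between the three languages is selected, so the sketch defaults to transcription.

```python
# Hypothetical helper, not part of the open-sarika repository.
# Maps the languages listed on this card to their Whisper language codes.
SUPPORTED_LANGUAGES = {"hindi": "hi", "gujarati": "gu", "marathi": "mr"}


def generation_kwargs(language: str, task: str = "transcribe") -> dict:
    """Build keyword arguments to pass to WhisperForConditionalGeneration.generate."""
    key = language.strip().lower()
    # Accept either a full name ("Hindi") or a code ("hi").
    code = SUPPORTED_LANGUAGES.get(key, key)
    if code not in SUPPORTED_LANGUAGES.values():
        raise ValueError(f"Unsupported language: {language!r}")
    if task not in ("transcribe", "translate"):
        raise ValueError(f"Unsupported task: {task!r}")
    return {"language": code, "task": task}


# Assumed usage, once a model and processed inputs are available:
#   output_ids = model.generate(**inputs, **generation_kwargs("Gujarati"))
```

Pinning the language is mainly useful for short or noisy clips, where auto-detection is more likely to pick the wrong language.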

Uses

Direct Use

The model can be used for:

  1. Transcribing speech in Hindi, Gujarati, and Marathi
  2. Translating speech between these languages

Here's a simple example to get started:

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa

model_id = "theharshithh/open-sarika-v1"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and processor
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id).to(device)
# Clear any decoder prompt ids pinned in the config so generate() picks the language and task tokens itself
model.config.forced_decoder_ids = None

# Load audio as mono, resampled to the 16 kHz rate Whisper expects
audio_path = "your_audio.wav"
audio, _ = librosa.load(audio_path, sr=16000)

# Generate transcription
inputs = processor(audio, sampling_rate=16000, return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs)
transcription = processor.batch_decode(output_ids, skip_special_tokens=True)[0]
print(transcription)
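
The example above handles a single clip. Whisper's feature extractor pads or truncates its input to a 30-second window by default, so longer recordings need to be split before transcription. The sketch below computes window boundaries for that purpose; `chunk_bounds` is a hypothetical helper, not part of this repository, and cutting at fixed offsets is an assumption made for simplicity (it can split a word across two windows).

```python
from typing import List, Tuple

SAMPLE_RATE = 16_000   # Hz, matches sr=16000 in the example above
WINDOW_SECONDS = 30    # Whisper's default input window


def chunk_bounds(
    num_samples: int,
    sample_rate: int = SAMPLE_RATE,
    window_seconds: int = WINDOW_SECONDS,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices covering the audio in consecutive windows."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    window = sample_rate * window_seconds
    return [
        (start, min(start + window, num_samples))
        for start in range(0, num_samples, window)
    ]


# A 70-second clip at 16 kHz splits into two full windows and one 10-second remainder.
print(chunk_bounds(70 * SAMPLE_RATE))
# [(0, 480000), (480000, 960000), (960000, 1120000)]

# Assumed usage with the objects from the example above:
#   texts = []
#   for start, end in chunk_bounds(len(audio)):
#       inputs = processor(audio[start:end], sampling_rate=16000, return_tensors="pt").to(device)
#       with torch.no_grad():
#           ids = model.generate(**inputs)
#       texts.append(processor.batch_decode(ids, skip_special_tokens=True)[0])
#   transcription = " ".join(texts)
```

The transformers automatic-speech-recognition pipeline also offers a `chunk_length_s` option that handles long audio with overlapping windows, which may give cleaner results at chunk boundaries.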

Training Data

The model was trained on a variety of datasets, including:

  • Project Vaani dataset: A large-scale Indian language collection project by the Indian Institute of Science (IISc) in collaboration with ARTPARK, funded by Google
  • High-quality speech recordings in Hindi, Gujarati, and Marathi from AI4Bharat
  • Real-world speech data from various sources

Hardware Requirements

  • Minimum RAM: 8GB
  • GPU: Recommended for faster inference
  • Storage: approximately 6.2GB for the weights (1.54B parameters stored in F32)
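
As a rough check on these numbers, the weight footprint can be estimated from the parameter count and the tensor type. The sketch below uses the 1.54B parameter count and F32 tensor type reported for this checkpoint; `weight_size_gb` is a hypothetical helper, and the estimate covers weights only, not activations or framework overhead.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_size_gb(num_params: int, dtype: str = "float32") -> float:
    """Estimate the size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 1_540_000_000  # 1.54B params, as reported for this checkpoint

print(f"{weight_size_gb(NUM_PARAMS, 'float32'):.2f} GB")  # 6.16 GB
print(f"{weight_size_gb(NUM_PARAMS, 'float16'):.2f} GB")  # 3.08 GB
```

On memory-constrained machines, loading the weights in half precision (for example by passing `torch_dtype=torch.float16` to `from_pretrained` on a GPU) roughly halves this footprint; the effect on accuracy for this fine-tune has not been measured here.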

Model Card Contact

For issues and feedback, please create an issue on the model's repository: https://huggingface.co/theharshithh/open-sarika-v1

GitHub

GitHub Repo: https://github.com/theharshithh/open-sarika

Model size: 1.54B params (Safetensors, tensor type F32)