Open-Sarika
Open-Sarika is a speech recognition and translation model for three Indian languages: Hindi, Gujarati, and Marathi. It transcribes speech in these languages and translates between them. It is an open-source implementation inspired by Sarvam AI's Saarika model.
Model Details
Model Description
- Model type: Speech Recognition and Translation (based on Whisper architecture)
- Language(s): Hindi (hi), Gujarati (gu), Marathi (mr)
- License: MIT
- Base Model: openai/whisper-large-v3
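The two-letter codes listed above are the language identifiers that Whisper-based tokenizers use. As a minimal sketch, the helper below validates a language name or code and builds the `language` and `task` keyword arguments that recent `transformers` releases accept in `WhisperForConditionalGeneration.generate`. The names `SUPPORTED_LANGUAGES` and `generation_kwargs` are hypothetical and not part of this repository, and whether this checkpoint honors these arguments is an assumption to verify against the model itself.

```python
# Illustrative helper; these names are hypothetical and not part of the repository.
SUPPORTED_LANGUAGES = {"hindi": "hi", "gujarati": "gu", "marathi": "mr"}


def generation_kwargs(language: str, task: str = "transcribe") -> dict:
    """Return keyword arguments for a Whisper-style generate() call."""
    key = language.strip().lower()
    # Accept either a full name ("Hindi") or a two-letter code ("hi").
    code = SUPPORTED_LANGUAGES.get(key, key)
    if code not in SUPPORTED_LANGUAGES.values():
        raise ValueError(f"Unsupported language: {language!r}")
    if task not in ("transcribe", "translate"):
        raise ValueError(f"Unsupported task: {task!r}")
    return {"language": code, "task": task}
```

The result could be passed along with the processed audio, for example `model.generate(**inputs, **generation_kwargs("Gujarati"))`. In the base Whisper models the `translate` task targets English; how this checkpoint exposes translation between the three languages is not documented here.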
Uses
Direct Use
The model can be used for:
- Transcribing speech in Hindi, Gujarati, and Marathi
- Translating speech between these languages
Here's a simple example to get started:
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa
model_id = "theharshithh/open-sarika-v1"
device = "cuda" if torch.cuda.is_available() else "cpu"
# Load model and processor
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id).to(device)
model.config.forced_decoder_ids = None
# Load and process audio
audio_path = "your_audio.wav"
audio, rate = librosa.load(audio_path, sr=16000)
# Generate transcription
inputs = processor(audio, sampling_rate=16000, return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs)
transcription = processor.batch_decode(output_ids, skip_special_tokens=True)[0]
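Whisper's feature extractor works on 30-second windows, so with default settings the example above covers only the first 30 seconds of a longer recording. One way to handle longer files is to split the waveform into fixed-length chunks and transcribe each chunk in turn. The sketch below shows only the splitting step. The helper name `split_audio` and the non-overlapping strategy are assumptions for illustration, and words that straddle a chunk boundary may be cut.

```python
import numpy as np

SAMPLE_RATE = 16000   # sampling rate used in the example above
CHUNK_SECONDS = 30    # length of Whisper's input window


def split_audio(audio: np.ndarray, sample_rate: int = SAMPLE_RATE,
                chunk_seconds: int = CHUNK_SECONDS) -> list:
    """Split a 1-D waveform into consecutive, non-overlapping chunks."""
    chunk_size = sample_rate * chunk_seconds
    # The final chunk may be shorter than chunk_size; the processor pads it.
    return [audio[start:start + chunk_size]
            for start in range(0, len(audio), chunk_size)]
```

Each chunk returned by `split_audio(audio)` can then go through `processor(...)` and `model.generate(...)` exactly as in the example above, with the decoded strings joined by spaces.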
Training Data
The model was trained on a variety of datasets, including:
- Project Vaani dataset: A large-scale Indian language collection project by the Indian Institute of Science (IISc) in collaboration with ARTPARK, funded by Google
- High-quality speech recordings in Hindi, Gujarati, and Marathi from AI4Bharat
- Real-world speech data from various sources
Hardware Requirements
- Minimum RAM: 8GB
- GPU: Recommended for faster inference
- Storage: Model size is approximately 1.5GB
Model Card Contact
For issues and feedback, please create an issue on the model's repository: https://huggingface.co/theharshithh/open-sarika-v1
GitHub
GitHub repository: https://github.com/theharshithh/open-sarika