MMS-TTS Tamil (Fine-tuned)

This model is a fine-tuned version of facebook/mms-tts for Tamil (ta), trained on the Mozilla Common Voice dataset. Fine-tuning adapts the multilingual base model to the phonetics and speech characteristics of Tamil, with the aim of improving Tamil text-to-speech output.

Model Details

Model Description

This is a Tamil TTS model based on the Facebook MMS-TTS architecture. It was fine-tuned on Tamil speech-text pairs from the Mozilla Common Voice dataset to improve pronunciation, rhythm, and voice quality. It is intended to support voice synthesis for conversational agents, accessibility tools, and Tamil voice assistants.

  • Model type: Text-to-Speech (TTS)
  • Language(s) (NLP): Tamil
  • Finetuned from model: facebook/mms-tts

Uses

Direct Use

This model can be used directly for converting Tamil text into natural-sounding Tamil speech. It's suitable for:

  • Assistive technologies for visually impaired Tamil speakers
  • Voice generation for Tamil content creators
  • Language learning tools

Bias, Risks, and Limitations

  • May reflect biases present in the Common Voice dataset.
  • Pronunciation accuracy may vary for regional dialects or names not present in the training data.
  • Limited emotional range due to dataset and architecture constraints.
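
One practical consequence of the pronunciation limitation is that characters outside the model's vocabulary (digits, Latin letters, unusual symbols) may be skipped or rendered poorly. The sketch below shows one way to flag such characters before synthesis. The helper name `find_unsupported_chars` and the example vocabulary are hypothetical; with this model the vocabulary could plausibly be taken from `tokenizer.get_vocab()`, but that wiring is an assumption and is not shown here.

```python
import unicodedata


def find_unsupported_chars(text, vocab):
    """Return characters of `text` that are missing from `vocab`, in first-seen order."""
    # Normalize so that composed and decomposed forms of a character compare equal.
    text = unicodedata.normalize("NFC", text)
    missing = []
    for ch in text:
        if ch.isspace():
            continue  # whitespace is not a pronunciation concern
        if ch not in vocab and ch not in missing:
            missing.append(ch)
    return missing


# Hypothetical vocabulary for illustration only: every code point in the
# Tamil Unicode block (U+0B80 to U+0BFF). A real check would use the
# characters the tokenizer actually knows.
example_vocab = {chr(cp) for cp in range(0x0B80, 0x0C00)}

print(find_unsupported_chars("இந்த 3 மாணவர்கள் OK", example_vocab))
# → ['3', 'O', 'K']
```

Anything the check reports (numbers, abbreviations, foreign names) is a candidate for spelling out in Tamil script before the text is passed to the model.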

How to Get Started with the Model



# Load the fine-tuned model and its tokenizer
import torch
from scipy.io.wavfile import write
from transformers import AutoTokenizer, AutoModelForTextToWaveform

tokenizer = AutoTokenizer.from_pretrained("Lingalingeswaran/facebook_mms_tamil")
model = AutoModelForTextToWaveform.from_pretrained("Lingalingeswaran/facebook_mms_tamil")

# Input text
text = "இந்த மாணவர்கள் எப்போதும் இப்படித்தான்"
inputs = tokenizer(text, return_tensors="pt")

# Generate the waveform without tracking gradients
with torch.no_grad():
    output = model(**inputs).waveform

# Save the waveform to a file at the model's own sample rate (16 kHz for MMS-TTS)
waveform = output.squeeze().cpu().numpy()
write("tamil_output.wav", rate=model.config.sampling_rate, data=waveform)

# (Optional) Play audio in a Jupyter notebook
from IPython.display import Audio
Audio("tamil_output.wav")
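
For longer passages, it may be more convenient to synthesize one piece at a time instead of passing the whole text in a single call. The following is a minimal sketch of a text splitter, assuming sentences end in `.`, `!`, or `?`; the function name `split_into_chunks` and the `max_chars` default are illustrative choices, not part of this model or of the transformers library. Each returned chunk could then be run through the tokenizer and model as in the snippet above, and the resulting waveforms concatenated.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Split text at sentence boundaries and pack sentences into chunks.

    Chunks hold at most `max_chars` characters, except that a single
    sentence longer than `max_chars` is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


long_text = "இந்த மாணவர்கள் எப்போதும் இப்படித்தான். அவர்கள் நாளை வருவார்கள்."
for chunk in split_into_chunks(long_text, max_chars=40):
    print(chunk)
```

Splitting on punctuation keeps each chunk a complete sentence, so that the joins between separately generated waveforms fall at natural pauses.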