Malayalam TTS Model (LFM2-350M Fine-tuned)
This repository contains a fine-tuned Malayalam Text-to-Speech (TTS) model based on LFM2-350M, trained using VyvoTTS (LLM-based TTS framework) and Unsloth.
Malayalam TTS, 24 kHz (LLM + SNAC Codec)
High-quality Malayalam text-to-speech model targeting natural pronunciation and clean prosody at 24 kHz, using a discrete audio codec (SNAC 24 kHz) for waveform reconstruction. Designed for lightweight deployment (~350M parameters) with GPU/CPU support.
Status: v0.1 (stable inference, strong pronunciation, limited emotional expressiveness). Roadmap includes expressive styles and non-verbal cues (laughter, giggles, breaths).
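To make the "lightweight deployment" claim concrete, here is a back-of-the-envelope estimate of the weight memory for a model of this size. It is only a sketch: it assumes the parameter count is exactly 350 million (the card says "~350M"), counts the weights alone, and ignores activations, the KV cache, and the SNAC decoder. The function name `weight_memory_gb` is made up for this example.

```python
# Rough weight-memory estimate for a ~350M-parameter model.
# Assumption: exactly 350e6 parameters; the model card only states "~350M".
PARAM_COUNT = 350_000_000

# Bytes per parameter for common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(param_count: int, dtype: str) -> float:
    """Return the size of the weights alone in gigabytes (10**9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(PARAM_COUNT, dtype):.2f} GB")
```

Under these assumptions the weights take roughly 1.4 GB in float32 and 0.7 GB in 16-bit formats, which is consistent with running on a modest GPU or on CPU.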
Highlights
Language: Malayalam (with support for basic English loanwords).
Sample Rate: 24 kHz, mono.
Codec: SNAC 24 kHz for fast decoding.
Model Size: ~350M parameters (small/efficient).
Strengths: Clear, non-robotic pronunciation; punctuation-aware phrasing.
Known Limits: Emotion range is narrow; limited style transfer; no speaker cloning in v0.1.
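Because the output format is fixed at 24 kHz mono, any audio saved from this model should carry those parameters in its file header. The sketch below shows one way to write such a file with Python's standard `wave` module. It does not run the model: the sine tone is a placeholder for decoded speech, and `write_wav_24k` and the file name `sample_24k.wav` are made up for this example. The 16-bit PCM sample width is also an assumption, since the card does not specify a bit depth.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # Hz, as stated in the model card
NUM_CHANNELS = 1      # mono, as stated in the model card


def write_wav_24k(path, samples):
    """Write float samples in [-1.0, 1.0] as 16-bit PCM mono at 24 kHz."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the valid range
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(NUM_CHANNELS)
        wf.setsampwidth(2)          # 2 bytes per sample = 16-bit (assumed)
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(bytes(frames))


# Placeholder signal: a 0.5 s, 440 Hz tone standing in for decoded speech.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 2)]
write_wav_24k("sample_24k.wav", tone)
```

Writing the sample rate explicitly matters: a player that assumes a different rate (for example 16 kHz or 44.1 kHz) would play the speech at the wrong speed and pitch.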
Model Details
- Base Model: LFM2-350M
- Language: Malayalam
- Dataset: ai4bharat/rasa (Malayalam subset)
- Training: 10 epochs, ~77k steps
- Frameworks Used: VyvoTTS, Unsloth
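The training figures above imply a per-epoch step count, which can be worked out directly. The sketch below treats "~77k steps" as exactly 77,000. The card does not state a batch size, so `effective_batch_size` is a hypothetical parameter included only to show how the step count would relate to the number of training examples.

```python
TOTAL_STEPS = 77_000  # "~77k steps" from the card, treated as exact here
EPOCHS = 10           # from the card

# Optimizer steps taken in each pass over the training data.
steps_per_epoch = TOTAL_STEPS // EPOCHS
print(steps_per_epoch)


def examples_per_epoch(steps: int, effective_batch_size: int) -> int:
    """Approximate examples seen per epoch for a given batch size.

    effective_batch_size is hypothetical; the model card does not report it.
    """
    return steps * effective_batch_size
```

This gives about 7,700 steps per epoch. With a known batch size, `examples_per_epoch(steps_per_epoch, batch_size)` would approximate the number of utterances seen in each epoch.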
Future Work
- Emotion and expressive style support
- Non-verbal cues (laughter, giggles, breaths)
- Multi-speaker extension