Update README.md
README.md
CHANGED
@@ -1,3 +1,107 @@
|
|
1 |
-
---
|
2 |
-
license: mit
|
3 |
-
|
1 |
+
---
|
2 |
+
license: mit
|
3 |
+
datasets:
|
4 |
+
- sameerbanchhor/CHATTISGARHI-TTS-F
|
5 |
+
language:
|
6 |
+
- hi
|
7 |
+
pipeline_tag: text-to-speech
|
8 |
+
tags:
|
9 |
+
- chattisgarhi
|
10 |
+
- chhattisgarh
|
11 |
+
---
|
12 |
+
|
13 |
+
# Chhattisgarhi Text-to-Speech (TTS) Model — VITS Based
|
14 |
+
|
15 |
+
A deep learning-based text-to-speech (TTS) model for the **Chhattisgarhi language**, trained using the **VITS architecture**. This project aims to make technology more accessible to the people of Chhattisgarh by converting Chhattisgarhi text into natural-sounding regional speech.
|
16 |
+
|
17 |
+
> **Author**: Sameer Banchhor
|
18 |
+
> MSc Student at Hemchand Yadav University
|
19 |
+
> 📧 Email: [email protected]
|
20 |
+
> 🐙 GitHub: [@sameer-banchhor-git](https://github.com/sameer-banchhor-git)
|
21 |
+
> 🔗 LinkedIn: [Sameer Banchhor](https://www.linkedin.com/in/sameer-banchhor-4a7373323/)
|
22 |
+
|
23 |
+
---
|
24 |
+
|
25 |
+
## 🌟 Project Highlights
|
26 |
+
|
27 |
+
- 🔊 **Language Support**: Chhattisgarhi (regional dialects and tones)
|
28 |
+
- 🧠 **Model Architecture**: [VITS](https://github.com/jaywalnut310/vits) (Variational Inference with adversarial learning for end-to-end Text-to-Speech)
|
29 |
+
- 🎯 **Goal**: Enable high-quality speech generation for educational, informational, and accessibility applications in Chhattisgarh
|
30 |
+
- 🛠️ **Current Status**: Model trained and functional; app development planned
|
31 |
+
|
32 |
+
---
|
33 |
+
|
34 |
+
## 📚 Dataset
|
35 |
+
|
36 |
+
**Primary Dataset**:
|
37 |
+
[CHATTISGARHI-TTS-F on Hugging Face](https://huggingface.co/datasets/sameerbanchhor/CHATTISGARHI-TTS-F) — curated and released by the author
|
38 |
+
|
39 |
+
**Data Composition**:
|
40 |
+
- Regional sentences in Chhattisgarhi, including various tones and expressions
|
41 |
+
- Transcriptions aligned with high-quality audio
|
42 |
+
- Data sourced from:
|
43 |
+
- YouTube spoken content (sentence-transcription pairs)
|
44 |
+
- IISE dataset for phonetic richness and clarity
|
45 |
+
|
46 |
+
**Sample Sentences with Regional Tone**:
|
47 |
+
1. _"राजस्थान के नामी ब्यंजन चूरमालाड़ू गुड़ के पाग म गहूँ के दरदरहा पिसान के लाड़ू म तिली अउ नरियल के सुवाद म सजथे"_
|
48 |
+
2. _"दुग्ध क्रान्ति भारत के योजना हे जेखर ले भारत म दूध के कमी ला दुरिहा करे जा सकथे एला श्वेत क्रांति घलोक कहिथे"_
|
49 |
+
3. _"जम्मू कश्मीर म पर्यटन उद्योग ला बढ़ावा देना उहाँ के अर्थबेवस्था ला सुचारू रूप ले चलाय बर जरुरी हे"_
|
50 |
+
|
51 |
+
---
|
52 |
+
|
53 |
+
## ⚙️ Model Training
|
54 |
+
|
55 |
+
| Component          | Details                             |
|--------------------|-------------------------------------|
| Architecture       | VITS (with adversarial training)    |
| Framework          | PyTorch                             |
| Audio Sampling     | 22050 Hz (standard TTS rate)        |
| Dataset Size       | 27 GB                               |
| Optimizer          | Adam                                |
| GPU                | NVIDIA RTX 3090                     |
| Vocoder Used       | Integrated with VITS                |
| Text Normalization | Custom normalization for Devanagari |
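
The table lists custom Devanagari text normalization but does not describe it. As an illustration only, a minimal sketch of what such a step might look like is shown below. The function name `normalize_devanagari` and the exact rules (Unicode NFC, punctuation and danda removal, whitespace collapsing) are assumptions, not the author's actual pipeline.

```python
import unicodedata

# Hypothetical normalizer: the rules below are assumptions for illustration,
# not the normalization actually used to train this model.

# ASCII punctuation, curly quotes, dashes, and the Devanagari danda / double danda.
_PUNCT = "\"'\u201c\u201d\u2018\u2019,.;:!?()[]{}-\u2013\u2014\u0964\u0965"
_PUNCT_TABLE = str.maketrans({ch: " " for ch in _PUNCT})


def normalize_devanagari(text: str) -> str:
    """Return text in a single canonical form, without punctuation or extra spaces."""
    # NFC gives one canonical encoding for visually identical strings,
    # e.g. precomposed nukta letters become base letter + nukta sign.
    text = unicodedata.normalize("NFC", text)
    # Replace punctuation with spaces so that joined words stay separated.
    text = text.translate(_PUNCT_TABLE)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(text.split())
```

Normalizing every transcript the same way before training and before inference keeps the model's input symbol set small and consistent, which tends to matter for low-resource data.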
|
65 |
+
|
66 |
+
---
|
67 |
+
|
68 |
+
## 🧪 Inference Example
|
69 |
+
|
70 |
+
```python
# `Synthesizer` is imported from this project's `models` module.
from models import Synthesizer

# Load the trained checkpoint together with its configuration file.
tts = Synthesizer(
    checkpoint_path="models/chhattisgarhi_vits.pth",
    config_path="configs/config.json",
)

# Synthesize speech for a Chhattisgarhi sentence.
tts.speak("जम्मू कश्मीर म पर्यटन उद्योग ला बढ़ावा देना उहाँ के अर्थबेवस्था ला सुचारू रूप ले चलाय बर जरुरी हे")
```
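
The example above does not show how generated audio is stored. Assuming the synthesizer can hand back mono float samples in the range [-1.0, 1.0] (an assumption; the README does not say what `speak` returns), a sketch for saving them as a 16-bit WAV file at the 22050 Hz rate from the training table could look like this. `write_wav` and `sine_tone` are hypothetical helpers, and the sine tone only stands in for model output.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050  # sampling rate listed in the training table


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to avoid integer overflow
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)  # mono
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))


def sine_tone(duration_s, freq_hz, sample_rate=SAMPLE_RATE):
    """Placeholder signal standing in for synthesized speech."""
    n = round(duration_s * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

Calling `write_wav("out.wav", sine_tone(0.1, 440))` produces a short test tone; in practice the sample list would come from the model.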
|
80 |
+
|
81 |
+
---
|
82 |
+
|
83 |
+
## 🛣️ Roadmap
|
84 |
+
|
85 |
+
* ✅ Train and test baseline VITS model on Chhattisgarhi dataset
|
86 |
+
* 🔄 Add support for tonal variation and emphasis
|
87 |
+
* 🧹 Improve preprocessing for low-resource sentence cleaning
|
88 |
+
* 📱 Develop full-featured TTS application (mobile/desktop/web)
|
89 |
+
* 🌐 Incorporate transliteration (for users who can’t type in Devanagari)
|
90 |
+
|
91 |
+
---
|
92 |
+
## 📄 License
|
93 |
+
|
94 |
+
Released under the MIT License.
|
95 |
+
|
96 |
+
## 🤝 Contributing
|
97 |
+
|
98 |
+
If you're a linguist, developer, or enthusiast in low-resource languages and speech tech, contributions are welcome! Feel free to fork the repo or reach out directly.
|
99 |
+
|
100 |
+
---
|
101 |
+
|
102 |
+
## 🙋 Support & Contact
|
103 |
+
|
104 |
+
For any issues, suggestions, or collaborations:
|
105 |
+
|
106 |
+
📧 **Email**: [[email protected]](mailto:[email protected])
|
107 |
+
📍 **GitHub**: [github.com/sameer-banchhor-git](https://github.com/sameer-banchhor-git)
|