Indonesian TTS Documentation

This guide walks through setting up and using an Indonesian Text-to-Speech (TTS) system built on a pretrained model. It covers downloading the model files, installing the required packages, and running a script that synthesizes speech from text.

Prerequisites

Ensure you have Python 3 with pip (pip3) and wget installed on your system.

Steps

1. Download the Pretrained Model and Configuration Files

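Download the pretrained checkpoint, the configuration file, and the speakers file from:

acul3/TTS-TESTV3/upload/main

The later steps expect the files checkpoint_1260000-inference.pth, config.json, and speakers.pth to be in the working directory.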
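
Before moving on, it can help to confirm that the downloaded files are where the later steps expect them. The sketch below is a minimal check, assuming the three file names used in step 7 and that they sit in the current working directory. The helper name find_missing_files is hypothetical and is not part of the TTS library.

```python
from pathlib import Path

# File names taken from step 7 of this guide; adjust them if yours differ.
REQUIRED_FILES = (
    "checkpoint_1260000-inference.pth",
    "config.json",
    "speakers.pth",
)


def find_missing_files(directory=".", required=REQUIRED_FILES):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```

If the list comes back non-empty, re-download the named files or pass their actual location to the Synthesizer in step 7.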

2. Install Required Packages

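Install the TTS library and the Indonesian Grapheme-to-Phoneme (G2P) converter. The leading ! runs the command from a Jupyter or Colab notebook cell; omit it when typing the commands in a regular shell: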

!pip install TTS
!pip3 install -U git+https://github.com/acul3/g2p-id

3. Import Libraries

Import the necessary libraries for TTS and G2P:

import torch  # used for GPU detection
from g2p_id import G2P  # Indonesian grapheme-to-phoneme converter
from TTS.utils.synthesizer import Synthesizer  # loads the checkpoint and runs inference

4. Check Device

Check if a GPU is available and set the device accordingly:

device = "cuda" if torch.cuda.is_available() else "cpu"

5. Initialize G2P

Initialize the Indonesian G2P converter:

g2p = G2P()

6. Prepare Text

Convert the input text to phonemes:

text = g2p("progress nya baru sampai sini, belum bisa real time baru sekitar dua detik buat generate nya, harus butuh data lebih banyak, sekitar dua kali lebih banyak,")

7. Initialize Synthesizer

Initialize the TTS synthesizer with the downloaded checkpoint, configuration, and speakers files, then move it to the selected device:

synthesizer = Synthesizer(
    tts_checkpoint="checkpoint_1260000-inference.pth",
    tts_config_path="config.json",
    tts_speakers_file="speakers.pth"
).to(device)

8. Synthesize Speech

Generate the speech audio from the phonemized text for the chosen speaker:

wav = synthesizer.tts(text, speaker_name="wibowo")

9. Save the Audio File

Save the generated audio to a file:

synthesizer.save_wav(wav, "wibowo.wav")
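
The sample sentence in step 6 mentions that generation is not yet real time. One way to quantify this is the real-time factor (RTF): the synthesis time divided by the duration of the audio produced, where values below 1 mean faster than real time. The sketch below assumes wav is a sequence of audio samples and that the output sample rate is known. The Synthesizer is assumed to expose it as output_sample_rate, which you should verify against your installed TTS version. The helper names are hypothetical.

```python
def audio_duration_seconds(num_samples, sample_rate):
    """Duration of an audio clip given its sample count and sample rate in Hz."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def real_time_factor(synthesis_seconds, num_samples, sample_rate):
    """Synthesis time divided by audio duration; below 1.0 is faster than real time."""
    duration = audio_duration_seconds(num_samples, sample_rate)
    if duration == 0:
        raise ValueError("audio is empty, real-time factor is undefined")
    return synthesis_seconds / duration


# Hypothetical usage around step 8 (assumes `synthesizer` and `text` exist and
# that `synthesizer.output_sample_rate` is available in your TTS version):
#
#   import time
#   start = time.perf_counter()
#   wav = synthesizer.tts(text, speaker_name="wibowo")
#   elapsed = time.perf_counter() - start
#   print(real_time_factor(elapsed, len(wav), synthesizer.output_sample_rate))
```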

Notes

Ensure the paths to the checkpoint, config, and speakers files are correctly specified.
Adjust the speaker_name parameter based on the available speakers in the speakers.pth file.
The synthesized audio is saved as wibowo.wav in the current working directory, because save_wav is given a relative path. Pass a full path to save it elsewhere.
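
To make a wrong speaker_name fail early with a readable message, the name can be checked against the available speakers before calling synthesizer.tts. This is a minimal sketch, assuming you have already obtained the speaker names as a plain list. How to read them out of speakers.pth depends on the TTS version and is not shown here. The helper check_speaker and the example name "ardi" are hypothetical.

```python
def check_speaker(requested, available):
    """Return the matching speaker name, or raise ValueError if there is none.

    An exact match is preferred; otherwise a case-insensitive match is accepted.
    """
    names = list(available)
    if requested in names:
        return requested
    lowered = {name.lower(): name for name in names}
    match = lowered.get(requested.lower())
    if match is not None:
        return match
    raise ValueError(
        f"Unknown speaker {requested!r}; available: {', '.join(sorted(names))}"
    )


# Hypothetical usage: "ardi" is an illustrative name, not a known speaker.
speaker = check_speaker("Wibowo", ["wibowo", "ardi"])
print(speaker)
```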

This completes the setup and usage guide for the Indonesian TTS system. For further customization, refer to the official documentation of the TTS library and the g2p-id converter.