PapaRazi/Ijazah_Palsu_V2 · Indonesian TTS Model (F5-TTS)
Ijazah_Palsu_V2 is a fine-tuned Indonesian speech synthesis model based on F5-TTS.
It was trained on a custom-curated dataset, PapaRazi/id-tts-v2, with a focus on natural and expressive Indonesian speech generation.
Model Details
- Base Framework: F5-TTS
- Training Time: ~3 days
- Dataset Size: ~70,000 samples (70 hours)
- Languages:
  - Bahasa Indonesia (95%)
  - English (5%); English quality is limited because of the small amount of English data
- License: Non-commercial use only
- Author: [PapaRazi](https://huggingface.co/PapaRazi) / https://github.com/adigayung
Training Configuration
{
  "exp_name": "F5TTS_v1_Base",
  "learning_rate": 1e-05,
  "batch_size_per_gpu": 1700,
  "batch_size_type": "frame",
  "max_samples": 64,
  "grad_accumulation_steps": 1,
  "max_grad_norm": 1,
  "epochs": 34,
  "num_warmup_updates": 7000,
  "save_per_updates": 15000,
  "keep_last_n_checkpoints": 7,
  "last_per_updates": 15000,
  "finetune": true,
  "file_checkpoint_train": "",
  "tokenizer_type": "char",
  "tokenizer_file": "",
  "mixed_precision": "fp16",
  "logger": "tensorboard",
  "bnb_optimizer": false
}
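As a rough sanity check on what these settings imply, the sketch below converts the frame-based batch size into seconds of audio and estimates the number of optimizer updates. The 24 kHz sample rate, the hop length of 256, the single-GPU setup, and perfect batch packing are all assumptions for illustration and are not stated in this model card; the helper names are hypothetical.

```python
import json
import math

# Subset of the training configuration shown above.
CONFIG = json.loads("""
{
  "batch_size_per_gpu": 1700,
  "batch_size_type": "frame",
  "grad_accumulation_steps": 1,
  "epochs": 34,
  "num_warmup_updates": 7000
}
""")

# Assumptions (not stated in the model card):
SAMPLE_RATE_HZ = 24_000   # assumed mel-spectrogram sample rate
HOP_LENGTH = 256          # assumed mel-spectrogram hop length
DATASET_HOURS = 70        # "~70 hours" from Model Details
NUM_GPUS = 1              # assumed single-GPU training


def frames_per_second(sample_rate_hz: int, hop_length: int) -> float:
    """Number of mel frames produced per second of audio."""
    return sample_rate_hz / hop_length


def seconds_per_batch(batch_frames: int, fps: float) -> float:
    """Audio duration that fits into one frame-based batch on one GPU."""
    return batch_frames / fps


def estimate_updates(hours: float, fps: float, config: dict, num_gpus: int):
    """Estimate updates per epoch and in total, assuming perfectly packed batches."""
    total_frames = hours * 3600 * fps
    frames_per_update = (
        config["batch_size_per_gpu"] * num_gpus * config["grad_accumulation_steps"]
    )
    per_epoch = math.ceil(total_frames / frames_per_update)
    return per_epoch, per_epoch * config["epochs"]


fps = frames_per_second(SAMPLE_RATE_HZ, HOP_LENGTH)
batch_seconds = seconds_per_batch(CONFIG["batch_size_per_gpu"], fps)
updates_per_epoch, total_updates = estimate_updates(DATASET_HOURS, fps, CONFIG, NUM_GPUS)

print(f"{fps:.2f} frames/s, {batch_seconds:.1f} s of audio per batch")
print(f"~{updates_per_epoch} updates per epoch, ~{total_updates} updates in total")
```

Under these assumptions, each batch holds roughly 18 seconds of audio, so the 7,000 warmup updates cover only a small fraction of the full run. Real batches are rarely packed perfectly, so the actual update count would be somewhat higher.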
Dataset
The training dataset, PapaRazi/id-tts-v2, consists of curated and cleaned audio-text pairs in Bahasa Indonesia. All preprocessing, splitting, and cleaning were done with a custom tool I developed: whisper-tools
The default dataset splitter from F5-TTS produced inconsistent results (clips that were too short or way too long), so I built a custom pipeline to ensure clean, consistent samples.
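The kind of consistency check described above can be sketched as a simple duration filter. This is not the whisper-tools implementation; the `Clip` type, the `filter_clips` helper, and the 1 to 20 second bounds are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One audio-text pair (hypothetical structure for this sketch)."""
    audio_path: str
    text: str
    duration_s: float


# Hypothetical bounds; the actual limits used for the dataset are not stated.
MIN_DURATION_S = 1.0
MAX_DURATION_S = 20.0


def filter_clips(clips, min_s=MIN_DURATION_S, max_s=MAX_DURATION_S):
    """Split clips into (kept, rejected) based on duration and non-empty text."""
    kept, rejected = [], []
    for clip in clips:
        has_text = bool(clip.text.strip())
        in_range = min_s <= clip.duration_s <= max_s
        (kept if has_text and in_range else rejected).append(clip)
    return kept, rejected


clips = [
    Clip("a.wav", "Selamat pagi.", 0.4),             # too short
    Clip("b.wav", "Apa kabar hari ini?", 3.2),        # within bounds
    Clip("c.wav", "Kalimat yang sangat panjang.", 45.0),  # too long
    Clip("d.wav", "   ", 5.0),                        # empty transcript
]
kept, rejected = filter_clips(clips)
print([c.audio_path for c in kept])
print([c.audio_path for c in rejected])
```

Rejecting out-of-range clips is the simplest option; a fuller pipeline might instead re-split long clips at silences or merge adjacent short ones.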
Audio Samples
Natural Sentence
"Suatu hari nanti, suara ini mungkin tidak bisa dibedakan lagi dari suara manusia asli."
Listen on vocaroo
Number Pronunciation (simple format)
"Serius?! Tiket konsernya habis dalam waktu 3 menit?!"
Listen on vocaroo
Number Hallucination (millions format, still imperfect)
"Masa cuma buat beli kursi kantor aja harus bayar Rp 2.500.000,-?! Gila sih itu!"
Listen on vocaroo
Warning: Reading large numbers (such as millions) is still inaccurate because the training dataset contains few examples of them.
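One possible workaround, which is not part of this model, is to spell out numbers as Indonesian words before passing text to the synthesizer. The sketch below is a minimal illustration under that assumption; `spell_number` and `normalize_numbers` are hypothetical helpers, and they only handle non-negative integers below one trillion written with "." as the thousands separator.

```python
import re

UNITS = ["nol", "satu", "dua", "tiga", "empat", "lima", "enam",
         "tujuh", "delapan", "sembilan", "sepuluh", "sebelas"]


def _tail(remainder: int) -> str:
    """Spell the remainder with a leading space, or nothing if it is zero."""
    return "" if remainder == 0 else " " + spell_number(remainder)


def spell_number(n: int) -> str:
    """Spell a non-negative integer below 10**12 in Indonesian words."""
    if n < 0 or n >= 10**12:
        raise ValueError("only 0 <= n < 10**12 is supported in this sketch")
    if n < 12:
        return UNITS[n]
    if n < 20:
        return UNITS[n - 10] + " belas"
    if n < 100:
        return UNITS[n // 10] + " puluh" + _tail(n % 10)
    if n < 200:
        return "seratus" + _tail(n - 100)
    if n < 1000:
        return UNITS[n // 100] + " ratus" + _tail(n % 100)
    if n < 2000:
        return "seribu" + _tail(n - 1000)
    if n < 10**6:
        return spell_number(n // 1000) + " ribu" + _tail(n % 1000)
    if n < 10**9:
        return spell_number(n // 10**6) + " juta" + _tail(n % 10**6)
    return spell_number(n // 10**9) + " miliar" + _tail(n % 10**9)


# Digits with optional "." thousands separators, e.g. "3" or "2.500.000".
NUMBER = r"\d{1,3}(?:\.\d{3})+|\d+"
CURRENCY_RE = re.compile(r"Rp\s*(" + NUMBER + r")(?:,-)?")
NUMBER_RE = re.compile(NUMBER)


def _to_int(digits: str) -> int:
    return int(digits.replace(".", ""))


def normalize_numbers(text: str) -> str:
    """Replace rupiah amounts and plain integers with Indonesian words."""
    text = CURRENCY_RE.sub(
        lambda m: spell_number(_to_int(m.group(1))) + " rupiah", text)
    return NUMBER_RE.sub(lambda m: spell_number(_to_int(m.group(0))), text)


print(normalize_numbers("Tiket konsernya habis dalam waktu 3 menit?!"))
print(normalize_numbers("harus bayar Rp 2.500.000,-?! Gila sih itu!"))
```

Decimals, ordinals, dates, and phone numbers are out of scope here and would need their own rules.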
License & Usage
This model is released for non-commercial use only. Feel free to explore, fine-tune, or give feedback!
Model tree for PapaRazi/Ijazah_Palsu_V2
- Base model: SWivid/F5-TTS