---
license: cc-by-nc-4.0
datasets:
- mesolitica/Malaysian-Emilia
language:
- ms
- en
base_model:
- SWivid/F5-TTS
new_version: mesolitica/Malaysian-F5-TTS-v3
---

# Full Parameter Finetuning Malaysian Emilia F5-TTS v2

Continued training from the [SWivid/F5-TTS](https://huggingface.co/SWivid/F5-TTS) `F5TTS_Base` checkpoint on [Malaysian-Emilia](https://huggingface.co/datasets/mesolitica/Malaysian-Emilia), with a total of 8472 hours, including 600 hours of Mandarin sampled from [amphion/Emilia-Dataset](https://huggingface.co/datasets/amphion/Emilia-Dataset).

## Features

1. This model should be able to perform zero-shot voice conversion for any Malaysian or Singaporean speaker.
2. This model is able to generate minimal filler sounds such as `erm` and `huh`, for example:

   `Isu sekarangnya, erm, kita harus jadi yang terbaik untuk rakyat Malaysia, dan kita, uh, kena makan nasi lemak yang sedap lagi lazat, hah, penat nak kena cakap.`

## Checkpoints

We uploaded the full checkpoints, including optimizer states, at [checkpoints](checkpoints).

## Dataset

We trained on a post-filtered version of [Malaysian-Emilia](https://huggingface.co/datasets/mesolitica/Malaysian-Emilia) called [Malaysian-Voice-Conversion](https://huggingface.co/datasets/mesolitica/Malaysian-Voice-Conversion).

## Source code

All source code is available at https://github.com/mesolitica/malaya-speech/tree/master/session/f5-tts