sm079/f5-tts-nepali-romanized-oslr43

Overview

This model is F5-TTS finetuned on a romanized version of the "High quality TTS data for Nepali" dataset (OpenSLR 43) for Nepali text-to-speech.

License

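This model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license, which permits use, modification, and redistribution for non-commercial purposes, provided that attribution is given and derivatives are shared under the same license.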

Model Information

Base Model: F5TTS_v1_Base (SWivid/F5-TTS/tree/main/F5TTS_v1_Base)
Training Duration: 110k steps
Dataset: High quality TTS data for Nepali (OpenSLR 43), romanized

Training Configuration:

{
    "exp_name": "F5TTS_v1_Base",
    "learning_rate": 1e-05,
    "batch_size_per_gpu": 1533,
    "batch_size_type": "frame",
    "max_samples": 64,
    "grad_accumulation_steps": 1,
    "max_grad_norm": 1,
    "epochs": 1949,
    "num_warmup_updates": 103,
    "save_per_updates": 10000,
    "keep_last_n_checkpoints": 20,
    "last_per_updates": 5000,
    "finetune": true,
    "file_checkpoint_train": "",
    "tokenizer_type": "pinyin",
    "tokenizer_file": "",
    "mixed_precision": "bf16",
    "logger": "none",
    "bnb_optimizer": true
}
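
Because "batch_size_type" is "frame", "batch_size_per_gpu" counts mel-spectrogram frames rather than utterances. The sketch below converts that frame budget into seconds of audio. It assumes the F5-TTS default mel settings of a 24 kHz sample rate and a hop length of 256; neither value is stated in this card, so treat the result as an estimate.

```python
import json

# Subset of the training configuration listed above.
CONFIG_JSON = """
{
    "batch_size_per_gpu": 1533,
    "batch_size_type": "frame",
    "max_samples": 64,
    "grad_accumulation_steps": 1
}
"""

# ASSUMPTION: F5-TTS default mel settings, not stated in this model card.
SAMPLE_RATE = 24000  # audio samples per second
HOP_LENGTH = 256     # audio samples per mel frame


def frames_to_seconds(frames, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Convert a number of mel frames to seconds of audio."""
    return frames * hop_length / sample_rate


cfg = json.loads(CONFIG_JSON)
assert cfg["batch_size_type"] == "frame"

# Frame budget per GPU for one optimizer update.
frames_per_update = cfg["batch_size_per_gpu"] * cfg["grad_accumulation_steps"]
seconds_per_update = frames_to_seconds(frames_per_update)

print(f"Frames per update (per GPU): {frames_per_update}")
print(f"Audio per update (per GPU):  {seconds_per_update:.2f} s")
```

Under these assumptions each update sees at most about 16 seconds of audio per GPU, with "max_samples" capping the number of utterances packed into a batch at 64.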

Usage Instructions

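For inference, follow the usage instructions in the base repository (SWivid/F5-TTS) and load this checkpoint in place of the base one. Because the model was finetuned on romanized text, provide the text to synthesize in romanized Nepali.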
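
As a rough illustration, the snippet below assembles a call to the base repository's `f5-tts_infer-cli` command for a finetuned checkpoint. It only builds and prints the command and does not run it. All file paths, the reference transcript, and the sample text are hypothetical placeholders; the actual checkpoint and vocabulary file names in this repository, and the romanization scheme the model expects, are not stated in this card.

```python
import shlex


def build_infer_command(ckpt_file, vocab_file, ref_audio, ref_text, gen_text):
    """Build the argument list for the F5-TTS inference CLI."""
    return [
        "f5-tts_infer-cli",
        "--model", "F5TTS_v1_Base",   # architecture of the base model
        "--ckpt_file", ckpt_file,     # finetuned checkpoint
        "--vocab_file", vocab_file,   # vocabulary used during finetuning
        "--ref_audio", ref_audio,     # short reference clip of the target voice
        "--ref_text", ref_text,       # transcript of the reference clip
        "--gen_text", gen_text,       # text to synthesize
    ]


# HYPOTHETICAL paths and text, shown only to illustrate the call shape.
cmd = build_infer_command(
    ckpt_file="path/to/finetuned_checkpoint.safetensors",
    vocab_file="path/to/vocab.txt",
    ref_audio="path/to/reference.wav",
    ref_text="romanized transcript of the reference clip",
    gen_text="namaste, tapaailai kasto chha?",
)

# Print a shell-safe command line that can be copied into a terminal.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, provided the `f5-tts` package is installed and the placeholder paths are replaced with real files.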
