Overview
This model is F5-TTS finetuned for Nepali text-to-speech on a romanized version of the "High quality TTS data for Nepali" dataset (OpenSLR 43).
License
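This model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license, which permits use, modification, and distribution for non-commercial purposes, provided that attribution is given and derivative works are shared under the same license.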
Model Information
Base Model: SWivid/F5-TTS/tree/main/F5TTS_v1_Base
Training Duration: 110k steps
Dataset: High quality TTS data for Nepali
Training Configuration:
{
  "exp_name": "F5TTS_v1_Base",
  "learning_rate": 1e-05,
  "batch_size_per_gpu": 1533,
  "batch_size_type": "frame",
  "max_samples": 64,
  "grad_accumulation_steps": 1,
  "max_grad_norm": 1,
  "epochs": 1949,
  "num_warmup_updates": 103,
  "save_per_updates": 10000,
  "keep_last_n_checkpoints": 20,
  "last_per_updates": 5000,
  "finetune": true,
  "file_checkpoint_train": "",
  "tokenizer_type": "pinyin",
  "tokenizer_file": "",
  "mixed_precision": "bf16",
  "logger": "none",
  "bnb_optimizer": true
}
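To give a feel for what these numbers mean in practice, the following is a rough sketch that converts the frame-based batch size into seconds of audio and counts the periodic checkpoints written over the run. The sample rate (24 kHz) and hop length (256) are assumptions based on the usual F5-TTS mel-spectrogram defaults; they are not stated in this card and should be checked against the base repository.

```python
import json

# Values copied from the training configuration in this card.
config = json.loads("""
{
  "batch_size_per_gpu": 1533,
  "batch_size_type": "frame",
  "num_warmup_updates": 103,
  "save_per_updates": 10000,
  "keep_last_n_checkpoints": 20
}
""")

# ASSUMPTIONS (not stated in this card): typical F5-TTS mel settings.
SAMPLE_RATE_HZ = 24_000
HOP_LENGTH = 256

# From "Training Duration: 110k steps".
TOTAL_STEPS = 110_000


def frames_to_seconds(frames, hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE_HZ):
    """Convert a number of mel frames into seconds of audio."""
    return frames * hop_length / sample_rate


# With batch_size_type == "frame", the batch size is a frame budget per GPU.
seconds_per_batch = frames_to_seconds(config["batch_size_per_gpu"])

# Number of periodic checkpoints written during the run.
checkpoints_written = TOTAL_STEPS // config["save_per_updates"]

# True if the retention limit is never reached, so no periodic checkpoint is pruned.
all_checkpoints_kept = checkpoints_written <= config["keep_last_n_checkpoints"]

# Share of the run spent in learning-rate warmup.
warmup_fraction = config["num_warmup_updates"] / TOTAL_STEPS

print(f"audio per batch per GPU: {seconds_per_batch:.2f} s")
print(f"periodic checkpoints written: {checkpoints_written}")
print(f"all periodic checkpoints kept: {all_checkpoints_kept}")
print(f"warmup fraction: {warmup_fraction:.4%}")
```

Under these assumptions each GPU step sees roughly 16 seconds of audio, which is a small batch and is consistent with the low learning rate used for finetuning.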
Usage Instructions
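See the base repository (SWivid/F5-TTS) for installation and inference instructions. Because the model was finetuned on romanized text, the text to synthesize should presumably be romanized Nepali as well.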
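As a starting point, here is a minimal sketch that assembles a command line for the base repository's `f5-tts_infer-cli` tool. The flag names reflect that CLI as commonly documented and should be verified with `f5-tts_infer-cli --help`. The checkpoint path, reference audio path, and sample sentences are hypothetical placeholders, since this card does not name the checkpoint file or describe the romanization scheme.

```python
import shlex


def build_infer_command(ckpt_file, ref_audio, ref_text, gen_text,
                        model="F5TTS_v1_Base"):
    """Build an argument list for f5-tts_infer-cli (flags assumed from the base repo)."""
    return [
        "f5-tts_infer-cli",
        "--model", model,          # architecture name, matches exp_name in the config
        "--ckpt_file", ckpt_file,  # finetuned checkpoint downloaded from this repo
        "--ref_audio", ref_audio,  # short reference clip of the target voice
        "--ref_text", ref_text,    # romanized transcript of the reference clip
        "--gen_text", gen_text,    # romanized Nepali text to synthesize
    ]


cmd = build_infer_command(
    ckpt_file="path/to/checkpoint.safetensors",  # hypothetical placeholder
    ref_audio="path/to/reference.wav",           # hypothetical placeholder
    ref_text="mero naam sita ho",                # illustrative romanized text
    gen_text="namaste, tapailai kasto chha?",    # illustrative romanized text
)

# Print a shell-safe command that can be copied into a terminal.
print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to inspect the command first, or to pass it to `subprocess.run(cmd, check=True)` once F5-TTS is installed.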
Model tree for sm079/f5-tts-nepali-romanized-oslr43
Base model
SWivid/F5-TTS