Overview

This model is F5-TTS fine-tuned on a romanized version of the [High quality TTS data for Nepali] dataset for Nepali text-to-speech.

License

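This model is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license, which allows use, modification, and distribution for non-commercial purposes, provided that attribution is given and derivatives are shared under the same license.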

Model Information

Base Model: SWivid/F5-TTS/tree/main/F5TTS_v1_Base
Training Duration: 110k steps
Dataset: High quality TTS data for Nepali

Training Configuration:

{
    "exp_name": "F5TTS_v1_Base",
    "learning_rate": 1e-05,
    "batch_size_per_gpu": 1533,
    "batch_size_type": "frame",
    "max_samples": 64,
    "grad_accumulation_steps": 1,
    "max_grad_norm": 1,
    "epochs": 1949,
    "num_warmup_updates": 103,
    "save_per_updates": 10000,
    "keep_last_n_checkpoints": 20,
    "last_per_updates": 5000,
    "finetune": true,
    "file_checkpoint_train": "",
    "tokenizer_type": "pinyin",
    "tokenizer_file": "",
    "mixed_precision": "bf16",
    "logger": "none",
    "bnb_optimizer": true
}
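Because "batch_size_type" is "frame", "batch_size_per_gpu" counts mel-spectrogram frames rather than utterances. The sketch below converts that frame budget into seconds of audio. It assumes 24 kHz audio and a mel hop length of 256, which are the defaults of the base F5-TTS configuration as far as I know; neither value is stated in this model card, so treat the result as an estimate.

```python
import json

# Subset of the training configuration listed above.
config = json.loads(
    '{"batch_size_per_gpu": 1533, "batch_size_type": "frame", '
    '"max_samples": 64, "grad_accumulation_steps": 1}'
)

# ASSUMPTION (not stated in this model card): 24 kHz audio, mel hop length 256.
SAMPLE_RATE_HZ = 24_000
HOP_LENGTH = 256


def frames_to_seconds(frames, sample_rate=SAMPLE_RATE_HZ, hop_length=HOP_LENGTH):
    """Convert a number of mel frames into seconds of audio."""
    return frames * hop_length / sample_rate


# Frames consumed per optimizer update on a single GPU.
frames_per_update = config["batch_size_per_gpu"] * config["grad_accumulation_steps"]
seconds_per_update = frames_to_seconds(frames_per_update)

print(f"{frames_per_update} frames per GPU per update "
      f"= about {seconds_per_update:.2f} s of audio")
```

Under these assumptions each update sees roughly 16 seconds of audio per GPU, which is a small batch; the number of GPUs used is not given here, so the global batch size cannot be derived from the configuration alone.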

Usage Instructions

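See the base repository (SWivid/F5-TTS) for usage instructions. Because the model was fine-tuned on romanized text, input text should be romanized Nepali.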
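As a rough illustration of inference with a fine-tuned checkpoint, the sketch below assembles a command line for the base repository's `f5-tts_infer-cli` tool. The flag names are written from memory of the base repository's README and should be checked against it. Every file name and the sample text are hypothetical placeholders; this model card does not list the checkpoint or vocabulary file names.

```python
import shlex


def build_infer_command(ckpt_file, vocab_file, ref_audio, ref_text, gen_text,
                        model="F5TTS_v1_Base"):
    """Build (but do not run) an f5-tts_infer-cli command as an argument list.

    ASSUMPTION: the flag names follow the base repository's CLI as I recall
    it; verify them against the SWivid/F5-TTS README before use.
    """
    return [
        "f5-tts_infer-cli",
        "--model", model,
        "--ckpt_file", ckpt_file,
        "--vocab_file", vocab_file,
        "--ref_audio", ref_audio,
        "--ref_text", ref_text,
        "--gen_text", gen_text,
    ]


cmd = build_infer_command(
    ckpt_file="model_110000.pt",        # hypothetical checkpoint file name
    vocab_file="vocab.txt",             # hypothetical vocabulary file name
    ref_audio="reference.wav",          # hypothetical reference recording
    ref_text="namaste",                 # romanized transcript of the reference
    gen_text="tapaiilai kasto chha",    # hypothetical romanized Nepali input
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute it once the F5-TTS package is installed and the placeholder paths point to real files.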

Other links

scripts

