questions
Hey, I have a problem. First, the base model comes like this:
F5TTS_Base
---model_1200000.pt
and the second folder looks like this:
F5TTS_v1_Base
----model_1250000.safetensors
------vocab.txt
You uploaded three models as .pt files:
model_190000.pt
model_235000.pt
model_380000.pt
My questions:
1- Which one should be used, and where is the safetensors version?
When I use F5-TTS it loads model_1250000.safetensors, and when I copy and paste one of your models in its place, it doesn't work.
2- I tried removing the base model model_1200000.pt and replacing it with one of yours, but every time I close the program and open it again, it downloads the original base model again.
3- In the interface there is a custom option that lets you choose the path. I tried it and it didn't work.
I have another question. The original files are on my F: drive; I cloned the repo from GitHub and used Miniconda. The problem is that the models are stored in:
C:\Users.cache\huggingface\hub\models--SWivid--F5-TTS\snapshots\d6bd6c3c3ec65c0a3ef25a6d3d09658c5e2817fd\F5TTS_v1_Base
and not in my
F:\F5-TTS...
So is there a way to keep everything in one folder under F:\F5-TTS\ ?
thanks.
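On the cache location: the path above is the default huggingface_hub cache, and huggingface_hub reads the `HF_HOME` environment variable to decide where that cache lives. A minimal sketch, assuming F5-TTS downloads through that default cache (the path above suggests so, but it is not confirmed in this thread); the helper name and the target folder are made up for illustration:

```python
import os
from pathlib import Path


def point_hf_cache_at(folder: str) -> Path:
    """Hypothetical helper: make huggingface_hub cache its downloads under `folder`."""
    cache_root = Path(folder)
    cache_root.mkdir(parents=True, exist_ok=True)
    # This has to run before huggingface_hub (or anything that imports it)
    # is imported, because the library reads HF_HOME at import time.
    os.environ["HF_HOME"] = str(cache_root)
    return cache_root


if __name__ == "__main__":
    # Assumed target folder; adjust to your own layout.
    print(point_hf_cache_at(r"F:\F5-TTS\hf_cache"))
```

Setting `HF_HOME` as a system or Miniconda environment variable before launching the program should have the same effect without touching any code. Models already downloaded to the old location are not moved automatically.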
You can use any of the three checkpoints uploaded here; also use the vocab.txt provided here.
If you want to fine-tune the model: by default the code looks for model_1200000.pt, so I advise you to hard-code the path in finetune_cli.py, then make sure no checkpoints exist in the ckpt folder. Alternatively, you can do something dirty like renaming the checkpoint provided here to model_1200000.pt so that it gets loaded instead. Make sure you use the F5 Base configuration, not v1; I did not try it with v1. Also change the experiment name.
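The "dirty" rename route can be sketched roughly as follows. This is only an illustration of the two steps described above (check that the experiment's ckpt folder is empty, then place the downloaded checkpoint under the name the code expects). The function name and all folder paths are hypothetical; where the base checkpoint actually has to live depends on your installation.

```python
import shutil
from pathlib import Path

CKPT_SUFFIXES = {".pt", ".safetensors"}


def prepare_base_checkpoint(downloaded: Path, base_dir: Path,
                            experiment_ckpt_dir: Path,
                            expected_name: str = "model_1200000.pt") -> Path:
    """Copy `downloaded` to base_dir/expected_name if no old checkpoints are in the way."""
    # Step 1: refuse to continue if the experiment folder already holds checkpoints,
    # since those could be picked up instead of the new base checkpoint.
    if experiment_ckpt_dir.is_dir():
        leftovers = sorted(p.name for p in experiment_ckpt_dir.iterdir()
                           if p.suffix in CKPT_SUFFIXES)
        if leftovers:
            raise FileExistsError(f"remove existing checkpoints first: {leftovers}")
    # Step 2: copy (not move) so the original download stays untouched.
    base_dir.mkdir(parents=True, exist_ok=True)
    target = base_dir / expected_name
    shutil.copy2(downloaded, target)
    return target


if __name__ == "__main__":
    # All three paths are assumptions for illustration only.
    print(prepare_base_checkpoint(Path("downloads/model_380000.pt"),
                                  Path("ckpts/F5TTS_Base"),
                                  Path("ckpts/my_arabic_experiment")))
```

Copying instead of renaming keeps the original file name around, which makes it easier to tell later which of the three checkpoints was used.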
Hi there.
Thank you for your model.
I have a question about this model. I tried it with some data and it does not work very well, so I think there may be an error in my steps.
model : model_380000.pt
vocab: vocab.txt (from this hub)
ref text: هَذَا سُؤَالٌ جَيِّدٌ
ref audio:
gen text: هَذَا سُؤَالٌ جَيِّدٌ (same as the ref text)
I downloaded all of these files and ran inference using src/f5_tts/train/finetune_gradio.py.
The final generated audio is: (the ref audio was 24 kHz)
When I tried the same audio at 16 kHz, the result was better, but still not as good as the samples in your README. Like this:
Could you help me with this, or could you run the same inference with model_380000.pt?
Thanks a lot.
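Since the result changes between the 24 kHz and 16 kHz versions of the reference clip, it can help to confirm what the file on disk really contains before comparing runs. A small sketch using Python's standard `wave` module; it only handles uncompressed PCM WAV files, and the helper name and file name are made up for illustration:

```python
import wave


def describe_wav(path: str) -> dict:
    """Return the sample rate, channel count and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate,
        }


if __name__ == "__main__":
    # Hypothetical file name; point this at your own reference clip.
    print(describe_wav("ref_audio.wav"))
```

Posting these values together with the generated audio makes it easier for others to reproduce the same inference.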
The model is not working; I tried it and got the same result.
Please try with this notebook: https://colab.research.google.com/drive/1kX7HB05CouHa5A-4Wy0UPqMuW4APqDBr?usp=sharing