
Need for a faster backend

#5
by Venkatesh4342 - opened

Could anyone please provide resources for running inference on a faster backend? Currently it takes 15 to 20 seconds for text with approximately 10-15 words.

@Venkatesh4342 Good day. Could you please provide the code necessary to load and execute this offline, without relying on a Hugging Face repository ID? Thank you.

You can try out the IndicF5 model by cloning the repo and running the code provided in the Hugging Face model card. Here's how:

git clone https://github.com/AI4Bharat/IndicF5.git
cd IndicF5
pip install -r requirements.txt

Then, create a Python script and copy the usage example from the model card to run inference.
Let me know if you run into any issues!
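As a starting point, here is a minimal sketch of what that script might look like. It follows the custom-code loading pattern (`trust_remote_code=True`); the repo id, the call signature of `model(...)`, and the 24 kHz sample rate are all assumptions, so check the model card's usage example for the exact interface before relying on it.

```python
def synthesize_sample():
    """Sketch only: the model call signature and sample rate below are
    assumptions based on typical F5-style TTS usage; verify them against
    the IndicF5 model card. Imports are kept inside the function so the
    sketch can be imported without the heavy dependencies installed."""
    import soundfile as sf
    from transformers import AutoModel

    # Loads the custom model class shipped with the repo.
    model = AutoModel.from_pretrained("ai4bharat/IndicF5", trust_remote_code=True)

    audio = model(
        "Text to synthesize goes here.",
        ref_audio_path="prompts/reference.wav",      # hypothetical reference clip
        ref_text="Transcript of the reference clip.",
    )
    sf.write("output.wav", audio, samplerate=24000)  # assumed sample rate

# synthesize_sample()  # uncomment to run (downloads the model on first call)
```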

@Venkatesh4342 Thanks for your prompt reply, but I have tried this and it only works in online mode: every time I run inference it hits the Hugging Face Hub. I want it to run on my local system. What I observed is that the model won't load even if I provide the local path; it still treats the path as a Hugging Face repo_id. Maybe I missed something... that's where I am confused.
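For what it's worth, the fallback described above can be illustrated with a tiny sketch. The helper below is hypothetical (transformers' actual resolution logic is more involved), but it captures the basic rule: only a string that points at an existing directory is loaded locally, and anything else is resolved as a Hub repo_id.

```python
import os

def resolve_model_source(name_or_path: str) -> str:
    """Hypothetical helper mimicking (roughly) how from_pretrained decides
    between a local folder and a Hub repo_id: an existing directory is
    loaded from disk, anything else goes to the Hub."""
    return "local" if os.path.isdir(name_or_path) else "hub"

# An existing directory resolves locally; a non-existent (or mistyped)
# path falls through to Hub resolution, which is why network calls can
# still happen even when you meant a local path.
print(resolve_model_source("."))                  # existing dir -> "local"
print(resolve_model_source("ai4bharat/IndicF5"))  # not a dir here -> "hub"
```

To guarantee no network access, `from_pretrained` also accepts `local_files_only=True` (it then raises instead of silently querying the Hub), and setting the environment variable `HF_HUB_OFFLINE=1` makes the Hub client refuse network calls entirely.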

Downloading the model and vocab file into a local folder and running them with F5-TTS worked, but the audio clarity was not the same as when using the Hugging Face version; you could try tweaking some parameters to fix it.
https://github.com/SWivid/F5-TTS?tab=readme-ov-file
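One way to pull the model and vocab files into a local folder in a single step is `huggingface_hub.snapshot_download`; the repo id and target directory below are assumptions. The returned folder path can then be pointed at by the F5-TTS tooling instead of a repo id.

```python
def download_indicf5(local_dir: str = "./IndicF5-local") -> str:
    """Fetch every file in the repo into local_dir and return the path.
    The import is kept inside the function so this sketch can be read
    and imported without huggingface_hub installed."""
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="ai4bharat/IndicF5",  # assumed Hub repo id
        local_dir=local_dir,
    )

# path = download_indicf5()  # uncomment to actually download (large files)
```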

Yes, it generated output, but in my case it gives me noise instead of proper speech.
Can you share the code to my personal mail?

[email protected]

Regarding how you load the model: I tried inf5 class-based loading (it seems they defined a custom class), and I also tried AutoModel, but neither was fruitful.
