what does this model do?

by MayensGuds - opened May 28

Discussion

MayensGuds

May 28

title

YaTharThShaRma999

May 28

@MayensGuds it's a TTS model, so it converts text to speech.
It's pretty realistic imo and can even laugh and pause; later models might have even more emotions. However, it does not have voice cloning or prompting like some other TTS models. Not the best TTS possible, but pretty decent.

Sam-786

Jun 7

I was looking for the same kind of model to integrate into my site https://www.tiroalpalotv.es, but in Spanish. Does this model also work for Spanish (español)? And is it mobile responsive?

YaTharThShaRma999

Jun 7

@Sam-786
No, this is not for Spanish, only for Chinese and English I believe.

It is also quite slow: you need a decent GPU for real-time inference, and it is far, far slower than real-time on a mobile device.

I would recommend XTTS v2 if you can run it on a GPU (it doesn't need to be decent or high end; 4 GB of VRAM is probably fine).

If you are running on mobile, I suppose OpenVoice V2 is a possibility.
