About
This is a basic zero-shot voice conversion model trained with VITS and ContentVec.
See:
https://github.com/alphacep/vosk-tts/tree/master/vc
https://github.com/quickvc/QuickVC-VoiceConversion
https://github.com/auspicious3000/contentvec
Speaker Similarity
Computed by eval.py using Resemblyzer.
Original QuickVC (trained on VCTK): average 0.667, minimum 0.477
New model: average 0.880, minimum 0.712
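The contents of eval.py are not reproduced here, so the following is only a minimal sketch of how an average and minimum speaker similarity could be computed. It assumes the score is the cosine similarity between the Resemblyzer embedding of a converted utterance and that of a reference recording of the target speaker. The function names `cosine_similarity`, `summarize`, and `score_pairs`, and the idea of passing (converted, reference) file pairs, are illustrative assumptions and not taken from the repository.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def summarize(scores: Iterable[float]) -> Tuple[float, float]:
    """Return (average, minimum) over a non-empty collection of scores."""
    values: List[float] = list(scores)
    return sum(values) / len(values), min(values)


def score_pairs(pairs: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Score (converted_wav_path, reference_wav_path) pairs.

    Hypothetical helper: the pairing scheme is an assumption.
    Resemblyzer is imported lazily so the helpers above work without it.
    """
    from resemblyzer import VoiceEncoder, preprocess_wav

    encoder = VoiceEncoder()
    scores = []
    for converted_path, reference_path in pairs:
        converted = encoder.embed_utterance(preprocess_wav(converted_path))
        reference = encoder.embed_utterance(preprocess_wav(reference_path))
        scores.append(cosine_similarity(converted, reference))
    return summarize(scores)
```

Under these assumptions, `score_pairs` returns the two numbers reported per model above: the mean similarity over all pairs and the worst-case pair.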