
Releases
Model | Published | Training Data | Langs & Voices | SHA256 |
---|---|---|---|---|
v1.0 | 2025 Jan 27 | Few hundred hrs | 8 & 54 | 496dba11 |
v0.19 | 2024 Dec 25 | <100 hrs | 1 & 10 | 3b0c392f |

Training Costs | v0.19 | v1.0 | Total |
---|---|---|---|
in A100 80GB GPU hours | 500 | 500 | 1000 |
average hourly rate | $0.80/h | $1.20/h | $1/h |
in USD | $400 | $600 | $1000 |
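
The USD row follows from multiplying GPU hours by the average hourly rate, and the $1/h in the Total column is the blended rate across both runs. A minimal sketch of that arithmetic, using only the figures from the table (the `runs` dictionary and the `cost_summary` helper are illustrative names, not part of any library):

```python
# GPU hours and average hourly rate (USD) per release, copied from the table above.
runs = {"v0.19": (500, 0.80), "v1.0": (500, 1.20)}


def cost_summary(runs):
    """Return per-run cost, total hours, total cost, and blended hourly rate."""
    costs = {name: round(hours * rate, 2) for name, (hours, rate) in runs.items()}
    total_hours = sum(hours for hours, _ in runs.values())
    total_cost = round(sum(costs.values()), 2)
    blended_rate = round(total_cost / total_hours, 2)
    return costs, total_hours, total_cost, blended_rate


costs, total_hours, total_cost, blended_rate = cost_summary(runs)
print(costs, total_hours, total_cost, blended_rate)
```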
Usage
You can run this basic cell on Google Colab. Listen to samples. For more languages and details, see Advanced Usage.
!pip install -q "kokoro>=0.9.2" soundfile
!apt-get -qq -y install espeak-ng > /dev/null 2>&1
from kokoro import KPipeline
from IPython.display import display, Audio
import soundfile as sf
import torch
pipeline = KPipeline(lang_code='a')
text = '''
Qhash is an open-weight TTS model with 84 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Qhash-TTS can be deployed anywhere from production environments to personal projects.
'''
generator = pipeline(text, voice='af_heart')
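for i, (gs, ps, audio) in enumerate(generator):
    print(i, gs, ps)  # i: segment index, gs: graphemes (text), ps: phonemes
    display(Audio(data=audio, rate=24000, autoplay=i==0))  # autoplay only the first segment
    sf.write(f'{i}.wav', audio, 24000)  # save each segment as a 24 kHz WAV file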
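
The loop above yields audio one segment at a time at a 24000 Hz sample rate, so the length of each clip can be read off from its sample count. A small sketch, assuming each `audio` value is a one-dimensional sequence of samples that supports `len()`; the `segment_durations` helper is a hypothetical name and not part of kokoro:

```python
from typing import Iterable, List, Sequence

SAMPLE_RATE = 24000  # Hz, the rate passed to Audio() and sf.write() in the cell above


def segment_durations(segments: Iterable[Sequence[float]],
                      rate: int = SAMPLE_RATE) -> List[float]:
    """Return the duration in seconds of each audio segment."""
    return [len(segment) / rate for segment in segments]


# Stand-in data: one second and half a second of silence.
example = [[0.0] * 24000, [0.0] * 12000]
durations = segment_durations(example)
print(durations, sum(durations))
```

To use it with the cell above, collect each `audio` into a list inside the loop and pass that list to the helper.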
Under the hood, Qhash-TTS uses misaki, a G2P library at https://github.com/hexgrad/misaki