CloneTTS - Text-to-Speech Model
CloneTTS is a Text-to-Speech (TTS) model trained on the Clone dataset, which pairs audio recordings with their text transcriptions. The model converts text input into natural-sounding speech and is intended for speech synthesis tasks.
License
This model is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to share, adapt, and use the model for any purpose, including commercial uses, as long as appropriate credit is given.
Model Overview
- Input: Text data.
- Output: .wav audio files (speech).
- Task: Text-to-speech (TTS) conversion.
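Since the output is a .wav file, the following minimal sketch shows one way a generated waveform could be saved using only the Python standard library. It does not call the model. The function name write_wav, the 22050 Hz sample rate, and the mono 16-bit PCM layout are assumptions for illustration, not documented properties of CloneTTS.

```python
import struct
import wave


def write_wav(path, samples, sample_rate=22050):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM .wav file.

    The sample rate and bit depth are illustrative assumptions, not values
    taken from the CloneTTS documentation.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip out-of-range values
        frames += struct.pack("<h", int(s * 32767))  # little-endian int16
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
```

If the model returns audio in another form (for example an integer array or a different sample rate), the conversion step would need to be adjusted accordingly.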
Features
- Convert text to high-quality, natural-sounding speech.
- Trained using the Clone dataset, designed to improve the quality of generated speech.
Dataset Overview
This model is trained on the Clone dataset, which consists of:
- Audio files: .wav format.
- Transcriptions: Corresponding text transcriptions for each audio file.
- Format: A CSV file that pairs audio file paths with their corresponding text.
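As a rough illustration of the CSV format described above, the sketch below reads such a file into (audio path, transcription) pairs. The column names audio_path and text are assumptions, since the actual header of the Clone dataset CSV is not documented here; pass different names if your file uses them.

```python
import csv
from pathlib import Path


def load_pairs(csv_path, audio_col="audio_path", text_col="text"):
    """Return a list of (audio_path, transcription) tuples from a CSV file.

    The default column names are hypothetical; adjust them to match the
    header row of your transcriptions file.
    """
    csv_path = Path(csv_path)
    pairs = []
    with csv_path.open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Relative audio paths are resolved against the CSV's directory.
            audio = csv_path.parent / row[audio_col]
            pairs.append((audio, row[text_col].strip()))
    return pairs
```

For example, load_pairs("data/transcriptions.csv") would return one tuple per row, which can then be checked for missing audio files before training.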
File Structure
- data/: Contains the audio files and the transcriptions.csv file used to train the model.
- model/: Contains the trained model files, including model_weights.h5 and model_config.json.
- notebooks/: Contains Jupyter notebooks for experimenting with the model and performing inference.
- requirements.txt: A list of required libraries and dependencies for running the model.
- train.py: Script to train the model on your dataset.
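The layout above can be sanity-checked after cloning with a short script. This is a sketch only: the helper names missing_files and load_config are hypothetical, and nothing is assumed about the keys inside model_config.json beyond it being valid JSON.

```python
import json
from pathlib import Path

# Files named in the File Structure section, relative to the repository root.
EXPECTED = [
    "data/transcriptions.csv",
    "model/model_weights.h5",
    "model/model_config.json",
    "requirements.txt",
    "train.py",
]


def missing_files(repo_root="."):
    """Return the expected files that are not present under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


def load_config(repo_root="."):
    """Parse model/model_config.json and return its contents."""
    config_path = Path(repo_root) / "model" / "model_config.json"
    with config_path.open(encoding="utf-8") as f:
        return json.load(f)
```

Running missing_files() from the repository root should return an empty list when the checkout is complete.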
Installation
To use this model, follow the instructions below to clone the repository and install dependencies.
- Clone the repository:
git clone https://github.com/your_username/CloneTTS.git
cd CloneTTS