To Meta AI Research: I would like to fold ylacombe/expresso into the training mix of an Apache TTS model series. Can you relax the Expresso dataset license to CC-BY or more permissive?
Barring that, can I have an individual exception to train on the materials and distribute trained Apache models, without direct redistribution of the original files? Thanks!
📣 Looking for labeled, high-quality synthetic audio/TTS data 📣 Have you been, or are you currently, calling API endpoints from OpenAI, ElevenLabs, etc.? Do you have labeled audio data sitting around gathering dust? Let's talk! Join https://discord.gg/QuGxSWBfQy or comment below.
If your data exceeds the quantity and quality thresholds and is approved into the next hexgrad/Kokoro-82M training mix, and you DM me the data under terms that are effectively an Apache license, then I will DM back the corresponding voicepacks for YOUR data if/when the next Apache-licensed Kokoro base model drops.
What does this mean? If you've been calling closed-source TTS or audio API endpoints to:
- Build voice agents
- Make long-form audio, like audiobooks or podcasts
- Handle customer support, etc.

Then YOU can contribute to the training mix and get useful artifacts in return. ❤️