Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2309.03199

Papers - Flow Matching

Movie Gen: A Cast of Media Foundation Models

Paper • 2410.13720 • Published 27 days ago • 87
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Paper • 2410.06885 • Published Oct 9 • 40
Flow Matching for Generative Modeling

Paper • 2210.02747 • Published Oct 6, 2022 • 1
Matcha-TTS: A fast TTS architecture with conditional flow matching

Paper • 2309.03199 • Published Sep 6, 2023 • 11

a collection of text to speech papers.

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Paper • 2402.08093 • Published Feb 12 • 54
E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14
Matcha-TTS: A fast TTS architecture with conditional flow matching

Paper • 2309.03199 • Published Sep 6, 2023 • 11
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

Paper • 1712.05884 • Published Dec 16, 2017 • 2

Papers - Audio - TTS

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions

Paper • 1712.05884 • Published Dec 16, 2017 • 2
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild

Paper • 2403.16973 • Published Mar 25 • 2
High Fidelity Neural Audio Compression

Paper • 2210.13438 • Published Oct 24, 2022 • 3
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Paper • 2404.03204 • Published Apr 4 • 7

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like

Paper • 2402.07383 • Published Feb 12 • 13
Matcha-TTS: A fast TTS architecture with conditional flow matching

Paper • 2309.03199 • Published Sep 6, 2023 • 11
Natural language guidance of high-fidelity text-to-speech with synthetic annotations

Paper • 2402.01912 • Published Feb 2 • 11
Fast Timing-Conditioned Latent Audio Diffusion

Paper • 2402.04825 • Published Feb 7 • 7

Matcha-TTS: A fast TTS architecture with conditional flow matching

Paper • 2309.03199 • Published Sep 6, 2023 • 11
E3 TTS: Easy End-to-End Diffusion-based Text to Speech

Paper • 2311.00945 • Published Nov 2, 2023 • 14
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 57
coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 1.42M • 1.95k

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs