Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis Paper • 2404.19622 • Published Apr 30, 2024
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation Paper • 2309.05455 • Published Sep 11, 2023
OverFlow: Putting flows on top of neural transducers for better TTS Paper • 2211.06892 • Published Nov 13, 2022
Neural HMMs are all you need (for high-quality attention-free TTS) Paper • 2108.13320 • Published Aug 30, 2021
Matcha-TTS: A fast TTS architecture with conditional flow matching Paper • 2309.03199 • Published Sep 6, 2023
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis Paper • 2306.09417 • Published Jun 15, 2023