Shih-Lun Wu

slseanwu

https://slseanwu.github.io

AI & ML interests

Deep Music Generation, Speech Processing

upvoted 6 papers over 1 year ago

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Paper • 2402.08093 • Published Feb 12, 2024 • 62

DITTO: Diffusion Inference-Time T-Optimization for Music Generation

Paper • 2401.12179 • Published Jan 22, 2024 • 22

EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis

Paper • 2311.08667 • Published Nov 15, 2023 • 19

Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text

Paper • 2311.07446 • Published Nov 13, 2023 • 29

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 28

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 45