Shih-Lun Wu

slseanwu
https://slseanwu.github.io
  • slseanwu
  • slSeanWU

AI & ML interests

Deep Music Generation, Speech Processing

Organizations

ESPnet, AIR2, mia-musgen

slseanwu's activity

upvoted 6 papers over 1 year ago

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Paper • 2402.08093 • Published Feb 12, 2024 • 62

DITTO: Diffusion Inference-Time T-Optimization for Music Generation

Paper • 2401.12179 • Published Jan 22, 2024 • 22

EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis

Paper • 2311.08667 • Published Nov 15, 2023 • 19

Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text

Paper • 2311.07446 • Published Nov 13, 2023 • 29

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 28

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 45