Audio-AGI (Audio-AGI)

haoheliu

authored a paper 2 months ago

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11 • 66

haoheliu

authored a paper 3 months ago

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published Feb 23 • 37

Xubo-Liu

authored a paper 6 months ago

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

Paper • 2411.19842 • Published Nov 29, 2024 • 12

haoheliu

authored a paper 10 months ago

Efficient Audio Captioning with Encoder-Level Knowledge Distillation

Paper • 2407.14329 • Published Jul 19, 2024 • 5

haoheliu

authored 2 papers about 1 year ago

SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

Paper • 2405.00233 • Published Apr 30, 2024 • 18

FlashSpeech: Efficient Zero-Shot Speech Synthesis

Paper • 2404.14700 • Published Apr 23, 2024 • 33

zzk1st

authored a paper about 1 year ago

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Paper • 2404.12387 • Published Apr 18, 2024 • 40

Xubo-Liu

updated a Space over 1 year ago

Space • AudioSep • 221

Xubo-Liu

authored a paper over 1 year ago

Retrieval-Augmented Text-to-Audio Generation

Paper • 2309.08051 • Published Sep 14, 2023 • 7

haoheliu

authored 2 papers over 1 year ago

Retrieval-Augmented Text-to-Audio Generation

Paper • 2309.08051 • Published Sep 14, 2023 • 7

AudioSR: Versatile Audio Super-resolution at Scale

Paper • 2309.07314 • Published Sep 13, 2023 • 28

Xubo-Liu

updated 2 Spaces over 1 year ago

Space • WavJourney • 192

Space • README

qiuqiangkong

authored a paper almost 2 years ago

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Paper • 2308.05734 • Published Aug 10, 2023 • 37

Xubo-Liu

authored a paper almost 2 years ago

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Paper • 2308.05734 • Published Aug 10, 2023 • 37

haoheliu

authored 2 papers almost 2 years ago

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Paper • 2308.05734 • Published Aug 10, 2023 • 37

MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies

Paper • 2308.01546 • Published Aug 3, 2023 • 18

Xubo-Liu

authored a paper almost 2 years ago

WavJourney: Compositional Audio Creation with Large Language Models

Paper • 2307.14335 • Published Jul 26, 2023 • 44

qiuqiangkong

authored a paper almost 2 years ago

WavJourney: Compositional Audio Creation with Large Language Models

Paper • 2307.14335 • Published Jul 26, 2023 • 44

JinhuaL1ANG

authored a paper almost 2 years ago

WavJourney: Compositional Audio Creation with Large Language Models

Paper • 2307.14335 • Published Jul 26, 2023 • 44

Team members 10
