Xuenan Xu

wsntxxn

https://wsntxxn.github.io

AI & ML interests

Text to Speech Synthesis, Text to Music Synthesis, Singing Voice Synthesis

Recent Activity

upvoted a paper about 1 month ago

Intern-S1: A Scientific Multimodal Foundation Model

updated a model about 2 months ago

wsntxxn/effb2-trm-clotho-captioning

updated a model about 2 months ago

wsntxxn/effb2-trm-audiocaps-captioning

Organizations

None yet

Generate a storytelling video from a topic and scene

Running

Efficient Audio Captioning

🔊

Generate captions from audio files

models 8

wsntxxn/effb2-trm-clotho-captioning

Feature Extraction • 0.0B • Updated Jul 28 • 45 • 1

wsntxxn/effb2-trm-audiocaps-captioning

Feature Extraction • 0.0B • Updated Jul 28 • 231 • 1

wsntxxn/cnn8rnn-laionclap-audiocapsv2-grounding

Audio Classification • 0.1B • Updated Jul 1 • 43 • 1

wsntxxn/cnn8rnn-audioset-sed

Audio Classification • 0.0B • Updated Dec 30, 2024 • 32 • 3

wsntxxn/cnn14rnn-tempgru-audiocaps-captioning

Feature Extraction • 0.1B • Updated Dec 27, 2024 • 22 • 1

wsntxxn/cnn8rnn-w2vmean-audiocaps-grounding

Audio Classification • 0.0B • Updated Aug 19, 2024 • 874 • 2

wsntxxn/audiocaps-simple-tokenizer

Updated Jun 19, 2024

wsntxxn/clotho-simple-tokenizer

Updated Jun 19, 2024

datasets 0

None public yet

Papers 10

spaces 2

MM StoryAgent

Efficient Audio Captioning
