Hugging Face
markredito's Collections

Audio

updated Jun 16, 2024

  • Retrieval-Augmented Text-to-Audio Generation

    Paper • 2309.08051 • Published Sep 14, 2023 • 7

  • A Large-scale Dataset for Audio-Language Representation Learning

    Paper • 2309.11500 • Published Sep 20, 2023 • 10

  • End-to-End Speech Recognition Contextualization with Large Language Models

    Paper • 2309.10917 • Published Sep 19, 2023 • 10

  • FoleyGen: Visually-Guided Audio Generation

    Paper • 2309.10537 • Published Sep 19, 2023 • 9

  • StemGen: A music generation model that listens

    Paper • 2312.08723 • Published Dec 14, 2023 • 49

  • ChatMusician: Understanding and Generating Music Intrinsically with LLM

    Paper • 2402.16153 • Published Feb 25, 2024 • 61

  • LLM-AD: Large Language Model based Audio Description System

    Paper • 2405.00983 • Published May 2, 2024 • 23

  • Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

    Paper • 2405.18386 • Published May 28, 2024 • 23