Hugging Face
abdar1925's Collections

Audio models

updated 14 days ago

  • kyutai/moshika-vis-pytorch-bf16

    Updated Mar 22 • 56

  • sesame/csm-1b

    Text-to-Speech • Updated 16 days ago • 44.2k • 2.08k

  • kyutai/mimi

    Feature Extraction • Updated Sep 18, 2024 • 509k • 211

  • kyutai/moshiko-pytorch-bf16

    Updated Sep 18, 2024 • 178k • 180

  • nvidia/canary-1b-flash

    Automatic Speech Recognition • Updated Mar 18 • 332k • 206

  • canopylabs/orpheus-3b-0.1-ft

    Text-to-Speech • Updated May 6 • 25.1k • 573

  • stepfun-ai/Step-Audio-Chat

    Audio-Text-to-Text • Updated Feb 17 • 114 • 442

  • Zyphra/Zonos-v0.1-hybrid

    Text-to-Speech • Updated 9 days ago • 10.8k • 1.08k

  • hexgrad/Kokoro-82M

    Text-to-Speech • Updated Apr 10 • 1.81M • 4.5k

  • Qwen/Qwen2.5-Omni-7B

    Any-to-Any • Updated Apr 30 • 391k • 1.66k

  • ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model

    Paper • 2503.21144 • Published Mar 27 • 25

  • nari-labs/Dia-1.6B

    Text-to-Speech • Updated 11 days ago • 174k • 2.52k

  • nvidia/parakeet-tdt-0.6b-v2

    Automatic Speech Recognition • Updated 21 days ago • 883k • 1.13k

  • ResembleAI/chatterbox

    Text-to-Speech • Updated 13 days ago • 757