Building on HF

Steven Zheng PRO

Steveeeeeeen

huggingface

AI & ML interests

speech & audio

Recent Activity

updated a dataset 3 days ago

Steveeeeeeen/whisper-leaderboard-evals

liked a model 13 days ago

google/medasr

published a model 19 days ago

Steveeeeeeen/voxtral-spanish-asr

upvoted a changelog 28 days ago

Changelog

Team & Enterprise Articles Now Featured on the Hugging Face Blog

29 days ago • 77

upvoted an article about 1 month ago

Article

Curating datasets directly on the Hub

Nov 27, 2025 • 22

upvoted a collection about 1 month ago

Step-Audio-EditX • 4 items • Updated Nov 19, 2025 • 10

upvoted an article about 1 month ago

Article

Continuous batching from first principles

Nov 25, 2025 • 296

upvoted an article about 2 months ago

Article

Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks

Nov 21, 2025 • 24

upvoted 3 papers about 2 months ago

MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance

Paper • 2510.00499 • Published Oct 1, 2025 • 19

Drax: Speech Recognition with Discrete Flow Matching

Paper • 2510.04162 • Published Oct 5, 2025 • 27

Treble10: A high-quality dataset for far-field speech recognition, dereverberation, and enhancement

Paper • 2510.23141 • Published Oct 27, 2025 • 4

upvoted 7 articles about 2 months ago

Article

Voice Cloning with Consent

Oct 28, 2025 • 34

Article

Introducing Cogito v2.1

Nov 19, 2025 • 17

Article

Granite 4.0 Nano: Just how small can you go?

Oct 28, 2025 • 121

Article

Aligning to What? Rethinking Agent Generalization in MiniMax M2

Oct 30, 2025 • 41

Article

Join the AMD Open Robotics Hackathon

Nov 13, 2025 • 12

Article

AI Model Optimization More Flexible Than Ever

Nov 17, 2025 • 13

Article

Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models

Nov 19, 2025 • 33

upvoted a paper about 2 months ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 132

upvoted an article 2 months ago

Article

Llasa Goes RL: Training LLaSA with GRPO for Improved Prosody and Expressiveness

Nov 5, 2025 • 10

upvoted a paper 2 months ago

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Paper • 2510.06961 • Published Oct 8, 2025 • 10

upvoted a paper 3 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 89

upvoted an article 3 months ago

Article

High-Quality Datasets for Far-Field ASR (Treble Technologies x Hugging Face)

Oct 13, 2025 • 16