Steven Zheng

Steveeeeeeen

AI & ML interests

speech & audio

Organizations

Hugging Face, The LLM Course, Hugging Test Lab, Whisper Distillation, Hugging Face OSS Metrics, Dynamic-SUPERB, Dynamic-SUPERB-Private, Hugging Face for Audio, huggingPartyParis, MLX Community, TTS AGI, Whisper Multilingual Distillation, Audio Collabs, open/ acc, MultiLlasa, fluxions-hf, Transformers Community, nvidia-hf-collab

Steveeeeeeen's activity

reacted to merve's post with 🔥 12 days ago
what happened in open AI this past week? so many vision LM & omni releases 🔥 merve/releases-23-may-68343cb970bbc359f9b5fb05

multimodal 💬🖼️
> new moondream (VLM) is out: it's a 4-bit quantized (with QAT) version of moondream-2b, runs on 2.5GB VRAM at 184 tps with only 0.6% drop in accuracy (OS) 🌚
> ByteDance released BAGEL-7B, an omni model that understands and generates both image + text. they also released Dolphin, a document parsing VLM 🐬 (OS)
> Google DeepMind dropped MedGemma at I/O, a VLM that can interpret medical scans, and Gemma 3n, an omni model with competitive LLM performance
> MMaDA is a new 8B diffusion language model that can generate image and text

LLMs
> Mistral released Devstral, a 24B coding assistant (OS) 👩🏻‍💻
> Fairy R1-32B is a new reasoning model -- distilled version of DeepSeek-R1-Distill-Qwen-32B (OS)
> NVIDIA released AceReason-Nemotron-14B, a new 14B math and code reasoning model
> sarvam-m is a new Indic LM with hybrid thinking mode, based on Mistral Small (OS)
> samhitika-0.0.1 is a new Sanskrit corpus (BookCorpus translated with Gemma3-27B)

image generation 🎨
> MTVCrafter is a new human motion animation generator
New activity in Exgc/OmniSep_VGGSOUND_eval 18 days ago

Add dataset card

#1 opened 18 days ago by Steveeeeeeen
upvoted an article 18 days ago

NVIDIA Cosmos Now Available On Hugging Face For Physical AI Reasoning

By PranjaliJoshi and 1 other