Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 501
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Dec 4, 2025 • 184
ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing Paper • 2506.21448 • Published Jun 26, 2025 • 8
Seedance 1.0: Exploring the Boundaries of Video Generation Models Paper • 2506.09113 • Published Jun 10, 2025 • 105
Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding Paper • 2505.18079 • Published May 23, 2025 • 5
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Paper • 2504.02542 • Published Apr 3, 2025 • 51
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Paper • 2503.16418 • Published Mar 20, 2025 • 36
Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10, 2025 • 192
Cosmos Collection ⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/nvidia-cosmos-2 • 31 items • Updated 2 days ago • 299