NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published 9 days ago • 134
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 126
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 528
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published Jun 23 • 33
view article Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 125
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
SeedVR Collection A diffusion transformer model for high-resolution image and video restoration. • 9 items • Updated 4 days ago • 7
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training Paper • 2506.05301 • Published Jun 5 • 55
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
FlashVideo:Flowing Fidelity to Detail for Efficient High-Resolution Video Generation Paper • 2502.05179 • Published Feb 7 • 24
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published Jan 2 • 11
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 81
DiTFastAttn: Attention Compression for Diffusion Transformer Models Paper • 2406.08552 • Published Jun 12, 2024 • 26