Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published 19 days ago • 30
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents Paper • 2505.21496 • Published May 27 • 39
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published May 1 • 44
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Paper • 2410.08196 • Published Oct 10, 2024 • 48
CameraCtrl: Enabling Camera Control for Text-to-Video Generation Paper • 2404.02101 • Published Apr 2, 2024 • 25
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 242