A Simple Video Segmenter by Tracking Objects Along Axial Trajectories Paper • 2311.18537 • Published Nov 30, 2023
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction Paper • 2504.21855 • Published 11 days ago • 12
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation Paper • 2502.20388 • Published Feb 27 • 16
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching Paper • 2412.15205 • Published Dec 19, 2024
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens Paper • 2501.07730 • Published Jan 13 • 17
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP Paper • 2308.02487 • Published Aug 4, 2023 • 13
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models Paper • 2406.09416 • Published Jun 13, 2024 • 30
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression Paper • 2406.20092 • Published Jun 28, 2024