Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation Paper • 2502.20388 • Published Feb 27 • 16
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching Paper • 2412.15205 • Published Dec 19, 2024
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens Paper • 2501.07730 • Published Jan 13 • 17
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP Paper • 2308.02487 • Published Aug 4, 2023 • 13
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models Paper • 2406.09416 • Published Jun 13, 2024 • 30
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression Paper • 2406.20092 • Published Jun 28, 2024