EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling Paper • 2502.09509 • Published Feb 13 • 8
YOLOv12: Attention-Centric Real-Time Object Detectors Paper • 2502.12524 • Published Feb 18 • 12
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 146
ObjectMover: Generative Object Movement with Video Prior Paper • 2503.08037 • Published Mar 11 • 4
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 73
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published Mar 18 • 153
TransMamba: Flexibly Switching between Transformer and Mamba Paper • 2503.24067 • Published Mar 31 • 21
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax Paper • 2504.20966 • Published Apr 29 • 32
Radial Attention: O(nlog n) Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24 • 41
Representing Speech Through Autoregressive Prediction of Cochlear Tokens Paper • 2508.11598 • Published 10 days ago • 16