GigaWorld-Policy: An Efficient Action-Centered World-Action Model Paper • 2603.17240 • Published 4 days ago • 22
MosaicMem: Hybrid Spatial Memory for Controllable Video World Models Paper • 2603.17117 • Published 4 days ago • 82
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents Paper • 2603.09827 • Published 11 days ago • 28
Helios: Real Real-Time Long Video Generation Model Paper • 2603.04379 • Published 17 days ago • 173
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 18 days ago • 99
Mode Seeking meets Mean Seeking for Fast Long Video Generation Paper • 2602.24289 • Published 22 days ago • 41
MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models Paper • 2602.17602 • Published about 1 month ago • 56
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos Paper • 2602.06949 • Published Feb 6 • 36
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published Jan 30 • 39
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 57
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published Jan 22 • 55
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning Paper • 2601.16163 • Published Jan 22 • 14
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals Paper • 2601.05848 • Published Jan 9 • 16