Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens Paper • 2506.17218 • Published 7 days ago • 17
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time Paper • 2506.18890 • Published 4 days ago • 4
VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory Paper • 2506.18903 • Published 4 days ago • 17
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Paper • 2506.18882 • Published 4 days ago • 80
Article Accelerating AI for Drug Discovery: Ginkgo’s GDPx Functional Genomics and GDPa Antibody Developability Dataset Series By cgeorgiaw and 1 other • 4 days ago • 11
4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation Paper • 2506.18839 • Published 9 days ago • 9
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis Paper • 2505.23325 • Published 30 days ago • 2
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published 4 days ago • 23
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition Paper • 2506.17201 • Published 7 days ago • 43
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement Paper • 2506.07634 • Published 19 days ago • 2
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence Paper • 2506.15677 • Published 9 days ago • 23
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies Paper • 2506.14315 • Published 11 days ago • 10
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence Paper • 2506.10600 • Published 16 days ago • 7
Cosmos-Predict2 Collection World Foundation Model for Future Prediction • 11 items • Updated 40 minutes ago • 13
Article Introducing Training Cluster as a Service - a new collaboration with NVIDIA By jeffboudier and 2 others • 17 days ago • 23
SpatialLM: Training Large Language Models for Structured Indoor Modeling Paper • 2506.07491 • Published 19 days ago • 38