Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens Paper • 2506.17218 • Published 7 days ago • 17
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time Paper • 2506.18890 • Published 4 days ago • 4
VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory Paper • 2506.18903 • Published 4 days ago • 17
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Paper • 2506.18882 • Published 4 days ago • 80
Article Accelerating AI for Drug Discovery: Ginkgo’s GDPx Functional Genomics and GDPa Antibody Developability Dataset Series By cgeorgiaw and 1 other • 4 days ago • 11
4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation Paper • 2506.18839 • Published 9 days ago • 9
Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis Paper • 2505.23325 • Published 30 days ago • 2
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published 4 days ago • 23
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition Paper • 2506.17201 • Published 7 days ago • 43
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement Paper • 2506.07634 • Published 19 days ago • 2