CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models Paper • 2503.18886 • Published about 17 hours ago • 7
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published 9 days ago • 28
EgoLife Collection CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated 18 days ago • 16
Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment Paper • 2502.04328 • Published Feb 6 • 30
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published Jan 23 • 25
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities Paper • 2501.08983 • Published Jan 15 • 20
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published Jan 15 • 15