BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published 10 days ago • 46
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning Paper • 2506.13654 • Published 15 days ago • 42
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias Paper • 2410.17242 • Published Oct 22, 2024 • 5
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14, 2024 • 39
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14, 2024 • 39