FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models Paper • 2407.11522 • Published Jul 16
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding Paper • 2401.09340 • Published Jan 17