VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
Paper • 2505.20279 • Published May 26