OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation Paper • 2502.18041 • Published Feb 25 • 1
FreeGaussian: Annotation-free Controllable 3D Gaussian Splats with Flow Derivatives Paper • 2410.22070 • Published Oct 29, 2024
AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use Paper • 2505.12650 • Published 26 days ago • 7
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step Paper • 2501.13926 • Published Jan 23 • 42
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model Paper • 2501.15830 • Published Jan 27 • 14
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance Paper • 2303.16894 • Published Mar 29, 2023 • 1
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following Paper • 2309.00615 • Published Sep 1, 2023 • 13
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models Paper • 2310.03059 • Published Oct 4, 2023
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding Paper • 2404.07989 • Published Apr 11, 2024
Exploring the Potential of Encoder-free Architectures in 3D LMMs Paper • 2502.09620 • Published Feb 13 • 26