Ivan Tang

IvanTang

Ivan_Tang_3D

AI & ML interests

Multimodal, 3D, PEFT, LLM & MLLM

Recent Activity

authored a paper 23 days ago

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

authored a paper 23 days ago

FreeGaussian: Annotation-free Controllable 3D Gaussian Splats with Flow Derivatives

authored a paper 23 days ago

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Organizations

None yet

IvanTang's activity

authored 3 papers 23 days ago

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Paper • 2502.18041 • Published Feb 25 • 1

FreeGaussian: Annotation-free Controllable 3D Gaussian Splats with Flow Derivatives

Paper • 2410.22070 • Published Oct 29, 2024

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Paper • 2505.12650 • Published 26 days ago • 7

upvoted a paper 23 days ago

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Paper • 2505.12650 • Published 26 days ago • 7

commented on a paper 23 days ago

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Paper • 2505.12650 • Published 26 days ago • 7

liked a model 29 days ago

IvanTang/ENEL

Image-Text-to-Text • Updated Feb 15 • 42 • 2

liked a model 2 months ago

agentica-org/DeepCoder-14B-Preview

Text Generation • Updated May 11 • 5.95k • 648

liked a dataset 3 months ago

ZiyuG/SciVerse

Viewer • Updated Sep 11, 2024 • 1.15k • 64 • 4

upvoted a paper 4 months ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23 • 42

liked a Space 4 months ago

OpenX LeRobot Visualizer

📚

Visualization of OpenX dataset in LeRobot format

updated a model 4 months ago

IvanTang/ENEL

Image-Text-to-Text • Updated Feb 15 • 42 • 2

New activity in IvanTang/ENEL 4 months ago

Add model card

#1 opened 4 months ago by nielsr

upvoted a paper 4 months ago

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Paper • 2501.15830 • Published Jan 27 • 14

authored 5 papers 4 months ago

ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance

Paper • 2303.16894 • Published Mar 29, 2023 • 1

Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following

Paper • 2309.00615 • Published Sep 1, 2023 • 13

upvoted a paper 4 months ago

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Paper • 2502.09620 • Published Feb 13 • 26

published a model 4 months ago

IvanTang/ENEL

Image-Text-to-Text • Updated Feb 15 • 42 • 2