Ivan Tang

IvanTang · Ivan_Tang_3D

AI & ML interests

Multimodal, 3D, PEFT, LLM & MLLM

Recent Activity

authored a paper 23 days ago

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

authored a paper 23 days ago

FreeGaussian: Annotation-free Controllable 3D Gaussian Splats with Flow Derivatives

authored a paper 23 days ago

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Organizations

None yet

IvanTang's activity

upvoted a paper 23 days ago

AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use

Paper • 2505.12650 • Published 26 days ago • 7

upvoted 3 papers 4 months ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23 • 42

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Paper • 2501.15830 • Published Jan 27 • 14

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Paper • 2502.09620 • Published Feb 13 • 26

upvoted a paper 5 months ago

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

Paper • 2409.15278 • Published Sep 23, 2024 • 26

upvoted a paper 10 months ago

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners

Paper • 2408.16768 • Published Aug 29, 2024 • 29

upvoted a paper about 1 year ago

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 54

upvoted a paper over 1 year ago

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 15