BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs Paper • 2307.08581 • Published Jul 17, 2023
Classification Done Right for Vision-Language Pre-Training Paper • 2411.03313 • Published Nov 5, 2024
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published Jan 7, 2025
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos Paper • 2501.12375 • Published Jan 21, 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published Apr 11, 2025
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding Paper • 2504.10465 • Published Apr 14, 2025
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer Paper • 2504.10462 • Published Apr 14, 2025
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Paper • 2401.10891 • Published Jan 19, 2024