Shilong Liu

ShilongLiu

https://www.lsl.zone

SlongLiu

AI & ML interests

Computer vision. Machine learning. Agents

Organizations

None yet

Recent Activity

upvoted a paper 23 days ago

Avenir-Web: Human-Experience-Imitating Multimodal Web Agents with Mixture of Grounding Experts

Paper • 2602.02468 • Published 28 days ago • 2

authored a paper 2 months ago

Web World Models

Paper • 2512.23676 • Published Dec 29, 2025 • 27

upvoted a paper 2 months ago

Web World Models

Paper • 2512.23676 • Published Dec 29, 2025 • 27

authored 17 papers 9 months ago

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

Paper • 2303.05499 • Published Mar 9, 2023 • 4

A Simple Framework for Open-Vocabulary Segmentation and Detection

Paper • 2303.08131 • Published Mar 14, 2023

Detection Transformer with Stable Matching

Paper • 2304.04742 • Published Apr 10, 2023

TOSS:High-quality Text-guided Novel View Synthesis from a Single Image

Paper • 2310.10644 • Published Oct 16, 2023 • 1

InstructPix2NeRF: Instructed 3D Portrait Editing from a Single Image

Paper • 2311.02826 • Published Nov 6, 2023 • 1

Recognize Anything: A Strong Image Tagging Model

Paper • 2306.03514 • Published Jun 6, 2023 • 12

detrex: Benchmarking Detection Transformers

Paper • 2306.07265 • Published Jun 12, 2023

DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting

Paper • 2307.12972 • Published Jul 24, 2023

Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks

Paper • 2401.14159 • Published Jan 25, 2024 • 6

Neural Interactive Keypoint Detection

Paper • 2308.10174 • Published Aug 20, 2023

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

Paper • 2201.12329 • Published Jan 28, 2022

DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

Paper • 2203.03605 • Published Mar 7, 2022

Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation

Paper • 2206.02777 • Published Jun 6, 2022

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Paper • 2403.14610 • Published Mar 21, 2024 • 3

Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models

Paper • 2405.04233 • Published May 7, 2024 • 3

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents

Paper • 2407.01511 • Published Jul 1, 2024

TAPTRv2: Attention-based Position Update Improves Tracking Any Point

Paper • 2407.16291 • Published Jul 23, 2024 • 11