peizesun

https://peizesun.github.io/

PeizeSun

peizesun's activity

authored 11 papers 15 days ago

ByteTrack: Multi-Object Tracking by Associating Every Detection Box

Paper • 2110.06864 • Published Oct 13, 2021

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion

Paper • 2111.14690 • Published Nov 29, 2021

DiffusionDet: Diffusion Model for Object Detection

Paper • 2211.09788 • Published Nov 17, 2022

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

Paper • 2407.07577 • Published Jul 10, 2024

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Paper • 2011.12450 • Published Nov 25, 2020

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Paper • 2412.15156 • Published Dec 19, 2024

Goku: Flow Based Video Generative Foundation Models

Paper • 2502.04896 • Published Feb 7 • 104

Language as Queries for Referring Video Object Segmentation

Paper • 2201.00487 • Published Jan 3, 2022

PixelFlow: Pixel-Space Generative Models with Flow

Paper • 2504.07963 • Published 25 days ago • 19

Perception Encoder: The best visual embeddings are not at the output of the network

Paper • 2504.13181 • Published 18 days ago • 34

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Paper • 2504.13180 • Published 18 days ago • 17

upvoted a collection 17 days ago

Perception Encoder

9 items • Updated 18 days ago • 38

upvoted 2 papers 17 days ago

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Paper • 2504.13180 • Published 18 days ago • 17

Perception Encoder: The best visual embeddings are not at the output of the network

Paper • 2504.13181 • Published 18 days ago • 34

upvoted 2 papers 3 months ago

Goku: Flow Based Video Generative Foundation Models

Paper • 2502.04896 • Published Feb 7 • 104

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation

Paper • 2502.05179 • Published Feb 7 • 24

authored a paper 3 months ago

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation

Paper • 2502.05179 • Published Feb 7 • 24

authored 2 papers 7 months ago

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

Paper • 2410.09347 • Published Oct 12, 2024 • 5

ControlAR: Controllable Image Generation with Autoregressive Models

Paper • 2410.02705 • Published Oct 3, 2024 • 11

updated a Space 10 months ago

LlamaGen