DisPose: Disentangling Pose Guidance for Controllable Human Image Animation Paper • 2412.09349 • Published Dec 12, 2024 • 8
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 35
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters Paper • 2410.23168 • Published Oct 30, 2024 • 24
MIBench: Evaluating Multimodal Large Language Models over Multiple Images Paper • 2407.15272 • Published Jul 21, 2024 • 10
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators Paper • 2404.05014 • Published Apr 7, 2024 • 33
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach Paper • 2401.15652 • Published Jan 28, 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation Paper • 2406.18522 • Published Jun 26, 2024 • 20
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes Paper • 2304.04321 • Published Apr 9, 2023
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation Paper • 2211.15402 • Published Nov 28, 2022
Explicit Shape Encoding for Real-Time Instance Segmentation Paper • 1908.04067 • Published Aug 12, 2019
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation Paper • 2308.07732 • Published Aug 15, 2023 • 2
CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds Paper • 2210.04264 • Published Oct 9, 2022
DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets Paper • 2301.06051 • Published Jan 15, 2023 • 1
GiT: Towards Generalist Vision Transformer through Universal Language Interface Paper • 2403.09394 • Published Mar 14, 2024 • 26