CreatiPoster: Towards Editable and Controllable Multi-Layer Graphic Design Generation Paper • 2506.10890 • Published Jun 12, 2025
Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens Paper • 2503.08377 • Published Mar 11, 2025
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024
Advancing Referring Expression Segmentation Beyond Single Image Paper • 2305.12452 • Published May 21, 2023
Shikra: Unleashing Multimodal LLM's Referential Dialogue Magic Paper • 2306.15195 • Published Jun 27, 2023
Described Object Detection: Liberating Object Detection with Flexible Expressions Paper • 2307.12813 • Published Jul 24, 2023
Co-Salient Object Detection with Co-Representation Purification Paper • 2303.07670 • Published Mar 14, 2023