Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding Paper • 2505.16990 • Published 19 days ago • 20 • 4
Introducing Visual Perception Token into Multimodal Large Language Model Paper • 2502.17425 • Published Feb 24 • 15 • 2
Attention Prompting on Image for Large Vision-Language Models Paper • 2409.17143 • Published Sep 25, 2024 • 7 • 2