GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Paper • 2505.17022 • Published May 22, 2025 • 27
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15, 2025 • 120
Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family Paper • 2504.18225 • Published Apr 25, 2025 • 13
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20, 2025 • 41
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Paper • 2503.10639 • Published Mar 13, 2025 • 52
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17, 2024 • 36
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published Feb 25, 2025 • 75