Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published 15 days ago • 97 • 1
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models Paper • 2505.16854 • Published 20 days ago • 11 • 2
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Paper • 2503.13444 • Published Mar 17 • 16 • 2
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary Paper • 2503.09402 • Published Mar 12 • 8 • 2