MathReal: We Keep It Real! A Real Scene Benchmark for Evaluating Math Reasoning in Multimodal Large Language Models Paper • 2508.06009 • Published 15 days ago • 15
Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning Paper • 2506.09736 • Published Jun 11 • 10
Agentic Robot: A Brain-Inspired Framework for Vision-Language-Action Models in Embodied Agents Paper • 2505.23450 • Published May 29 • 9
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published May 28 • 46
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published May 28 • 37
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes Paper • 2504.11544 • Published Apr 15 • 43
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 128
MLP-KAN: Unifying Deep Representation and Function Learning Paper • 2410.03027 • Published Oct 3, 2024 • 32