Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning Paper • 2506.09736 • Published Jun 11 • 10
Agentic Robot: A Brain-Inspired Framework for Vision-Language-Action Models in Embodied Agents Paper • 2505.23450 • Published May 29 • 9
Evaluating and Steering Modality Preferences in Multimodal Large Language Model Paper • 2505.20977 • Published May 27 • 9
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published May 28 • 37
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published May 28 • 46
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes Paper • 2504.11544 • Published Apr 15 • 42
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 64
An adapted large language model facilitates multiple medical tasks in diabetes care Paper • 2409.13191 • Published Sep 20, 2024 • 8
InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4 Paper • 2308.12067 • Published Aug 23, 2023 • 4