Improved Iterative Refinement for Chart-to-Code Generation via Structured Instruction Paper • 2506.14837 • Published 7 days ago • 8
Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning Paper • 2506.09736 • Published 11 days ago • 10
Agentic Robot: A Brain-Inspired Framework for Vision-Language-Action Models in Embodied Agents Paper • 2505.23450 • Published 24 days ago • 9
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published 25 days ago • 36
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO Paper • 2505.22453 • Published 24 days ago • 45
MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks Paper • 2505.16459 • Published about 1 month ago • 45
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes Paper • 2504.11544 • Published Apr 15 • 42
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? Paper • 2504.06514 • Published Apr 9 • 39
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective Paper • 2502.14296 • Published Feb 20 • 46
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published Nov 28, 2024 • 34
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 125
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published Nov 6, 2024 • 50
BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks Paper • 2305.17100 • Published May 26, 2023 • 2
BenTo: Benchmark Task Reduction with In-Context Transferability Paper • 2410.13804 • Published Oct 17, 2024 • 20