Improved Iterative Refinement for Chart-to-Code Generation via Structured Instruction • Paper • 2506.14837 • Published Jun 15
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models • Paper • 2506.06395 • Published Jun 5
Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning • Paper • 2506.09736 • Published Jun 11
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start • Paper • 2505.22334 • Published May 28
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO • Paper • 2505.22453 • Published May 28