Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published Dec 23, 2024 • 44
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published Mar 24 • 31
Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging Paper • 2505.05464 • Published May 8 • 10
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping Paper • 2505.15612 • Published 21 days ago • 32
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist Paper • 2407.08733 • Published Jul 11, 2024 • 23
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 16