VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning Paper • 2507.22607 • Published 25 days ago • 45
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2 • 53
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published Feb 12 • 59