Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning • Paper • 2508.08221 • Published 11 days ago • 40
CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks • Paper • 2507.23751 • Published 22 days ago • 4
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems • Paper • 2508.07407 • Published 12 days ago • 82
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination • Paper • 2507.10532 • Published Jul 14 • 85
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published May 6 • 184