MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning • Paper • 2503.07459 • Published Mar 10 • 16
Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search • Paper • 2502.17248 • Published Feb 24 • 1
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence • Paper • 2507.21046 • Published 27 days ago • 79