Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper • 2507.06229 • Published 5 days ago • 66
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale Paper • 2506.04405 • Published Jun 4 • 4
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering Paper • 2505.07782 • Published May 12 • 18
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning Paper • 2503.07459 • Published Mar 10 • 16
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training Paper • 2502.06589 • Published Feb 10 • 19
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search Paper • 2310.13227 • Published Oct 20, 2023 • 13