When to Trust Context: Self-Reflective Debates for Context Reliability • Paper • 2506.06020 • Published Jun 6 • 1
MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning • Paper • 2503.07459 • Published Mar 10 • 16