LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Paper • 2506.11928 • Published Jun 13
Dynamic Risk Assessments for Offensive Cybersecurity Agents Paper • 2505.18384 • Published May 23