X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents Paper • 2504.13203 • Published Apr 15 • 31
MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations Paper • 2504.07830 • Published Apr 10 • 18
SciCode: A Research Coding Benchmark Curated by Scientists Paper • 2407.13168 • Published Jul 18, 2024 • 14