CocoaBench: Evaluating Unified Digital Agents in the Wild Paper • 2604.11201 • Published 19 days ago
How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition Paper • 2603.15714 • Published Mar 16