AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant Paper • 2410.18603 • Published Oct 24, 2024 • 33
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26 • 103
xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations Paper • 2506.13651 • Published Jun 16 • 9
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents Paper • 2507.19478 • Published Jul 25 • 29
OpenCUA: Open Foundations for Computer-Use Agents Paper • 2508.09123 • Published 13 days ago • 28
CoAct-1: Computer-using Agents with Coding as Actions Paper • 2508.03923 • Published 20 days ago • 14