FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published 7 days ago • 53
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published 16 days ago • 99
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published 9 days ago • 45
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory Paper • 2508.09736 • Published 9 days ago • 49
Matrix-3D: Omnidirectional Explorable 3D World Generation Paper • 2508.08086 • Published 11 days ago • 67
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published 15 days ago • 116
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published 14 days ago • 35
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use Paper • 2508.04482 • Published 16 days ago • 9
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems Paper • 2508.07407 • Published 12 days ago • 82
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 14 days ago • 155
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published 21 days ago • 219
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper • 2508.01780 • Published 19 days ago • 13
CoAct-1: Computer-using Agents with Coding as Actions Paper • 2508.03923 • Published 17 days ago • 14
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience Paper • 2508.04700 • Published 16 days ago • 47
Efficient Agents: Building Effective Agents While Reducing Cost Paper • 2508.02694 • Published 29 days ago • 81