Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models Paper • 2508.10751 • Published 9 days ago • 25
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency Paper • 2508.05615 • Published 16 days ago • 20
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL Paper • 2508.07976 • Published 12 days ago • 46
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published 15 days ago • 36
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization Paper • 2508.07629 • Published 13 days ago • 39
OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks Paper • 2508.05614 • Published 16 days ago • 18
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 15 days ago • 156
Position: The Current AI Conference Model is Unsustainable! Diagnosing the Crisis of Centralized AI Conference Paper • 2508.04586 • Published 17 days ago • 12
Efficient Agents: Building Effective Agents While Reducing Cost Paper • 2508.02694 • Published about 1 month ago • 82
R-Zero: Self-Evolving Reasoning LLM from Zero Data Paper • 2508.05004 • Published 17 days ago • 117