In-Context Reinforcement Learning for Tool Use in Large Language Models Paper • 2603.08068 • Published 10 days ago • 37
Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams Paper • 2603.07392 • Published 11 days ago • 17
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning Paper • 2603.04918 • Published 14 days ago • 55
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? Paper • 2603.03194 • Published 15 days ago • 55
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published 19 days ago • 86