Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published 18 days ago • 53
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published 22 days ago • 86
GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay Paper • 2508.04676 • Published 17 days ago • 4
ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants Paper • 2508.03936 • Published 18 days ago • 9
HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches Paper • 2508.08088 • Published 12 days ago • 28
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning Paper • 2508.09726 • Published 10 days ago • 11
Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment Paper • 2508.07750 • Published 12 days ago • 19
UI-Venus Technical Report: Building High-performance UI Agents with RFT Paper • 2508.10833 • Published 9 days ago • 38
PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts Paper • 2508.09848 • Published 10 days ago • 65
Hop, Skip, and Overthink: Diagnosing Why Reasoning Models Fumble during Multi-Hop Analysis Paper • 2508.04699 • Published 17 days ago • 2
PRvL: Quantifying the Capabilities and Risks of Large Language Models for PII Redaction Paper • 2508.05545 • Published 16 days ago • 2
InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities Paper • 2508.05496 • Published 16 days ago • 9
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models Paper • 2508.02120 • Published 19 days ago • 18
Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability Paper • 2508.04017 • Published 17 days ago • 11