OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published 2 days ago • 30
ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering Paper • 2506.09050 • Published 17 days ago • 7
Generative AI Act II: Test Time Scaling Drives Cognition Engineering Paper • 2504.13828 • Published Apr 18 • 17
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published Apr 3 • 30
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models Paper • 2501.03124 • Published Jan 6 • 14