Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published 14 days ago • 49
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30 • 44
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning Paper • 2506.18841 • Published 15 days ago • 56
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks Paper • 2506.10954 • Published 26 days ago • 51
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards Paper • 2505.24760 • Published May 30 • 64
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 132
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 166
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 258
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published 22 days ago • 250
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs Paper • 2311.02262 • Published Nov 3, 2023 • 15