Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published 13 days ago • 49
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30 • 44
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning Paper • 2506.18841 • Published 14 days ago • 55
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks Paper • 2506.10954 • Published 24 days ago • 51
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards Paper • 2505.24760 • Published May 30 • 64
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 132
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 165
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 258
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published 21 days ago • 250
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3, 2024 • 51 • 15
Open LLM Leaderboard 🏆: Track, rank and evaluate open LLMs and chatbots • Running on CPU Upgrade • 13.3k
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs Paper • 2311.02262 • Published Nov 3, 2023 • 15
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise Paper • 2310.19019 • Published Oct 29, 2023 • 9 • 3