OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published 2 days ago • 30
🐙 OctoThinker Collection Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated 1 day ago • 1
🧙 Guru Collection Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective • 4 items • Updated 7 days ago
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published 10 days ago • 42
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26 • 102
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping Paper • 2505.15612 • Published May 21 • 33