AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published 6 days ago • 84
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published 7 days ago • 41
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published 16 days ago • 60
Article OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve By codelion • 16 days ago • 18
Article Blazingly fast whisper transcriptions with Inference Endpoints By mfuntowicz and 5 others • 24 days ago • 67
OpenMathReasoning Collection Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated about 4 hours ago • 40
OpenCodeReasoning Collection Reasoning data for supervised finetuning of LLMs to advance data distillation for competitive coding • 7 items • Updated about 4 hours ago • 16
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published Apr 14 • 84
Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.06k
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published Mar 31 • 62
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published Mar 31 • 22
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published Mar 25 • 18
Open Deep Search: Democratizing Search with Open-source Reasoning Agents Paper • 2503.20201 • Published Mar 26 • 47
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking Paper • 2503.19855 • Published Mar 25 • 28
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published Mar 20 • 51
Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets By mingyuliutw and 4 others • Mar 18 • 41
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13 • 29