- S*: Test Time Scaling for Code Generation
  Paper • 2502.14382 • Published • 63
- o1-Coder: an o1 Replication for Coding
  Paper • 2412.00154 • Published • 45
- Competitive Programming with Large Reasoning Models
  Paper • 2502.06807 • Published • 70
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
  Paper • 2502.18449 • Published • 74
Collections including paper arxiv:2505.16400

- nvidia/AceReason-Nemotron-14B
  Text Generation • Updated • 44.2k • 78
- nvidia/AceReason-Nemotron-7B
  Text Generation • Updated • 39k • 11
- AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
  Paper • 2505.16400 • Published • 30
- nvidia/AceReason-Math
  Viewer • Updated • 49.6k • 339 • 4

- J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
  Paper • 2505.10320 • Published • 22
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
  Paper • 2505.09343 • Published • 63
- Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models
  Paper • 2505.10554 • Published • 118
- Scaling Reasoning can Improve Factuality in Large Language Models
  Paper • 2505.11140 • Published • 6

- Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
  Paper • 2504.20752 • Published • 91
- Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
  Paper • 2504.21233 • Published • 45
- AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model
  Paper • 2211.11363 • Published • 1
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
  Paper • 2405.12130 • Published • 51