nvidia/Llama-Nemotron-Post-Training-Dataset • Dataset
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • arXiv:2501.12948 • Published Jan 22