-
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Paper • 2310.16818 • Published • 32 -
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 47 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 54 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 62
Collections
Discover the best community collections!
Collections including paper arxiv:2504.02495
-
Inference-Time Scaling for Generalist Reward Modeling
Paper • 2504.02495 • Published • 27 -
Large Language Diffusion Models
Paper • 2502.09992 • Published • 110 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 220 -
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
Paper • 2501.18511 • Published • 19
-
Competitive Programming with Large Reasoning Models
Paper • 2502.06807 • Published • 70 -
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Paper • 2502.06703 • Published • 149 -
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
Paper • 2502.06781 • Published • 61 -
LIMO: Less is More for Reasoning
Paper • 2502.03387 • Published • 61
-
Scaling LLM Inference with Optimized Sample Compute Allocation
Paper • 2410.22480 • Published -
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper • 2501.02497 • Published • 45 -
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
Paper • 2412.14135 • Published -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 96