Return of the Encoder: Maximizing Parameter Efficiency for SLMs Paper • 2501.16273 • Published 3 days ago • 4
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 22 days ago • 90
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 27 days ago • 89