Prior Prompt Engineering for Reinforcement Fine-Tuning • Paper • 2505.14157 • Published 22 days ago • 5
M-Prometheus • Collection • Open multilingual LLM judges for automatic evaluation. • 6 items • Updated Apr 8 • 6
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging • Paper • 2502.09056 • Published Feb 13 • 32
Reasoning Datasets • Collection • Reasoning datasets that are trending 🔥 • 10 items • Updated Jan 3 • 25