The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason Paper • 2505.22653 • Published May 28 • 67
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 89
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning Paper • 2505.16410 • Published May 22 • 57
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20 • 63
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published Jan 22 • 62
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 244