Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers Paper • 2505.19439 • Published 16 days ago • 30
ReSearch Collection Trained models as described in the paper "ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning" • 5 items • Updated Mar 27 • 6
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published Mar 25 • 18