Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published Mar 12 • 29
Learning to Learn Faster from Human Feedback with Language Model Predictive Control Paper • 2402.11450 • Published Feb 18, 2024 • 23