Submitted by iseesaw 166 A Survey of Reinforcement Learning for Large Reasoning Models · 39 authors 1.43k 5
Submitted by taesiri 54 AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning · 23 authors 394 2
Submitted by TongZheng1999 28 CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models · 11 authors 2
Submitted by spermwhale 16 The Majority is not always right: RL training for solution aggregation · 6 authors 2
Submitted by memyprokotow 13 <think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs · 3 authors 2
Submitted by taesiri - HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants · 4 authors 2