Submitted by iseesaw 98 A Survey of Reinforcement Learning for Large Reasoning Models · 39 authors 889 3
Submitted by taesiri 21 AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning · 23 authors 116 2
Submitted by TongZheng1999 16 CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models · 11 authors 1
Submitted by spermwhale 6 The Majority is not always right: RL training for solution aggregation · 6 authors 2
Submitted by memyprokotow 5 <think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs · 3 authors 2
Submitted by taesiri - HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants · 4 authors 2