RL - a cooleel Collection

RL

updated 6 days ago

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Paper • 2502.14768 • Published 17 days ago • 44

S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Paper • 2502.12853 • Published 20 days ago • 28

Diverse Inference and Verification for Advanced Reasoning
Paper • 2502.09955 • Published 24 days ago • 17

Distillation Scaling Laws
Paper • 2502.08606 • Published 25 days ago • 46

Small Models Struggle to Learn from Strong Reasoners
Paper • 2502.12143 • Published 20 days ago • 28

OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Paper • 2502.11271 • Published 21 days ago • 16

CRANE: Reasoning with constrained LLM generation
Paper • 2502.09061 • Published 25 days ago • 18

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published Jan 22 • 341

Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published 6 days ago • 59