Text2Grad: Reinforcement Learning from Natural Language Feedback Paper • 2505.22338 • Published 14 days ago • 7
Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability Paper • 2506.01789 • Published 9 days ago • 13
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents Paper • 2505.15277 • Published 21 days ago • 99
FREESON: Retriever-Free Retrieval-Augmented Reasoning via Corpus-Traversing MCTS Paper • 2505.16409 • Published 20 days ago • 2
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think Paper • 2505.10185 • Published 27 days ago • 25
Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators Paper • 2503.19877 • Published Mar 25 • 1