RL thinking - an Augusteinia Collection

RL thinking

updated 1 day ago

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published 6 days ago • 17

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published 7 days ago • 55

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published 6 days ago • 109

Scaling Reasoning can Improve Factuality in Large Language Models

Paper • 2505.11140 • Published 5 days ago • 5