Augusteinia's Collections

RL thinking

updated 1 day ago

  • J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

    Paper • 2505.10320 • Published 6 days ago • 17

  • Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

    Paper • 2505.09343 • Published 7 days ago • 55

  • Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

    Paper • 2505.10554 • Published 6 days ago • 109

  • Scaling Reasoning can Improve Factuality in Large Language Models

    Paper • 2505.11140 • Published 5 days ago • 5