yifanzhang114's Collections

R1-Reward

updated May 6

Training Multimodal Reward Model Through Stable Reinforcement Learning

  • yifanzhang114/R1-Reward-RL

    Viewer • Updated about 1 month ago • 17.3k • 212 • 3

  • yifanzhang114/R1-Reward

    8B • Updated May 9 • 96 • 6

  • R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

    Paper • 2505.02835 • Published May 5 • 27