NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation • Paper • 2504.13055 • Published Apr 17
Efficient Process Reward Model Training via Active Learning • Paper • 2504.10559 • Published Apr 14