Fostering Video Reasoning via Next-Event Prediction Paper • 2505.22457 • Published about 1 month ago • 27
Optimizing Anytime Reasoning via Budget Relative Policy Optimization Paper • 2505.13438 • Published May 19 • 35
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19 • 45
view article Article Accelerating LLM Inference: Fast Sampling with Gumbel-Max Trick By cxdu • Oct 24, 2024 • 12
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Paper • 2504.13055 • Published Apr 17 • 19
🚀 Active PRM Collection Efficient Process Reward Model Training via Active Learning. • 4 items • Updated Apr 16 • 3
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 52
Efficient Process Reward Model Training via Active Learning Paper • 2504.10559 • Published Apr 14 • 13
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18 • 18
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models Paper • 2412.05939 • Published Dec 8, 2024 • 16