MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge Paper • 2507.21183 • Published 27 days ago • 13
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE Paper • 2507.21802 • Published 25 days ago • 10
EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity Paper • 2507.21848 • Published 25 days ago • 7