RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization
Paper • 2510.02172 • Published Oct 2, 2025