- Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
  Paper • 2603.25562 • Published • 13
- Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
  Paper • 2604.13016 • Published • 77
- From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space
  Paper • 2604.14142 • Published • 24
- TIP: Token Importance in On-Policy Distillation
  Paper • 2604.14084 • Published • 11
Hugo Laurençon
HugoLaurencon
AI & ML interests
None yet
Recent Activity
updated a collection 1 day ago: Papers LLM training tricks
updated a collection 1 day ago: Papers LLM training tricks
updated a collection 3 days ago: Papers LLM training tricks