SakanaAI's Collections

Reinforcement Learning Teachers

Student models distilled from a 7B Reinforcement-Learned Teacher (RLT), as introduced in the paper "Reinforcement Learning Teachers of Test Time Scaling."