
SakanaAI/RLT-7B
Text Generation • 8B
Student models distilled from a 7B Reinforcement-Learned Teacher (RLT), as introduced in the paper "Reinforcement Learning Teachers of Test Time Scaling."