
SakanaAI/RLT-7B
Text Generation • 8B parameters
Student models distilled from a 7B Reinforcement-Learned Teacher (RLT), as described in the paper "Reinforcement Learning Teachers of Test Time Scaling."