RADLADS
This repository contains the model described in the paper RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale.
GitHub repository: https://github.com/recursal/Monet