Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels By drbh and 1 other • Aug 18 • 75
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 118
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 172