matlok's Collections
Papers - RoPE
Resonance RoPE: Improving Context Length Generalization of Large Language Models
Paper • 2403.00071 • Published • 23

Scaling Laws of RoPE-based Extrapolation
Paper • 2310.05209 • Published • 7

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models
Paper • 2404.12387 • Published • 39

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper • 2404.14619 • Published • 127

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Paper • 2404.07129 • Published • 3

Round and Round We Go! What makes Rotary Positional Encodings useful?
Paper • 2410.06205 • Published • 1

ThunderKittens: Simple, Fast, and Adorable AI Kernels
Paper • 2410.20399 • Published • 1