matlok's Collections
Papers - Context
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
Paper • 2402.10790 • Published • 42
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
Paper • 2402.11550 • Published • 17
A Neural Conversational Model
Paper • 1506.05869 • Published • 2
Data Engineering for Scaling Language Models to 128K Context
Paper • 2402.10171 • Published • 24
World Model on Million-Length Video And Language With RingAttention
Paper • 2402.08268 • Published • 38
GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
Paper • 2310.00576 • Published • 2
Ring Attention with Blockwise Transformers for Near-Infinite Context
Paper • 2310.01889 • Published • 10
Scaling Laws of RoPE-based Extrapolation
Paper • 2310.05209 • Published • 7
Extending Context Window of Large Language Models via Positional Interpolation
Paper • 2306.15595 • Published • 53
Longformer: The Long-Document Transformer
Paper • 2004.05150 • Published • 3
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Paper • 2403.09347 • Published • 21
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • 2404.07143 • Published • 106
RULER: What's the Real Context Size of Your Long-Context Language Models?
Paper • 2404.06654 • Published • 35
LLoCO: Learning Long Contexts Offline
Paper • 2404.07979 • Published • 21
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Paper • 2404.08801 • Published • 65
Length Generalization of Causal Transformers without Position Encoding
Paper • 2404.12224 • Published • 1
Qwen2 Technical Report
Paper • 2407.10671 • Published • 161
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?
Paper • 2407.11963 • Published • 44