- In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss
  Paper • 2402.10790 • Published • 42
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
  Paper • 2408.03314 • Published • 57
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 77
Gabriel Pendl
jompaaa
AI & ML interests
None yet
Recent Activity
liked a model 1 day ago: KRLabsOrg/lettucedect-base-modernbert-en-v1
liked a model 1 day ago: knowledgator/gliclass-base-v2.0-rac-init
liked a model 2 days ago: HuggingFaceTB/SmolLM2-1.7B-Instruct
Organizations
None yet
Collections
1
Datasets
None public yet