HARP: Hesitation-Aware Reframing in Transformer Inference Pass • Paper 2412.07282 • Published Dec 10, 2024
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only • Paper 2306.01116 • Published Jun 1, 2023
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? • Paper 2204.05832 • Published Apr 12, 2022
What Language Model to Train if You Have One Million GPU Hours? • Paper 2210.15424 • Published Oct 27, 2022
Augmenting Autotelic Agents with Large Language Models • Paper 2305.12487 • Published May 21, 2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model • Paper 2211.05100 • Published Nov 9, 2022