Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models By loubnabnl and 2 others • Mar 20, 2024 • 94