Article • Improving Hugging Face Training Efficiency Through Packing with Flash Attention • By lwtr and 5 others • Aug 21, 2024 • 37
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Paper • 2408.15079 • Published Aug 27, 2024 • 55