Cut Your Losses in Large-Vocabulary Language Models • Paper • 2411.09009 • Published Nov 13, 2024
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies • Paper • 2407.13623 • Published Jul 18, 2024