To Softmax, or not to Softmax: that is the question when applying Active Learning for Transformer Models Paper • 2210.03005 • Published Oct 6, 2022
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics Paper • 2410.21272 • Published Oct 28, 2024
MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training Paper • 2502.20855 • Published Feb 28, 2025