Contains files for a Transformer model that answers 6-digit subtraction questions (e.g. 123450-345670=-0222220) with very low loss (1e-8).
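For concreteness, here is a minimal sketch of how such questions might be generated. It assumes zero-padded 6-digit operands and a sign-prefixed, zero-padded 7-digit answer field, as in the example above; the use of "+" for non-negative answers is an assumption made to keep the string a fixed width:

```python
# Hypothetical question generator for the "dddddd-dddddd=sddddddd" format.
# The exact sign/padding convention is inferred from the example above,
# not taken from the repository itself.
import random

def make_question(rng: random.Random) -> str:
    a = rng.randrange(0, 1_000_000)   # first operand, zero-padded to 6 digits
    b = rng.randrange(0, 1_000_000)   # second operand, zero-padded to 6 digits
    ans = a - b
    sign = "-" if ans < 0 else "+"    # assumed: "+" marks non-negative answers
    return f"{a:06d}-{b:06d}={sign}{abs(ans):07d}"

rng = random.Random(0)
print(make_question(rng))
```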
This subtraction model has 3 layers, 4 attention heads, d_model = 510, and d_head = 170. It was initialised from a very-low-loss Addition model (2 layers, 3 attention heads, 9e-9 loss) before being trained for 45K epochs.
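A minimal sketch of instantiating a model with this shape, assuming a TransformerLens-style HookedTransformer. Only n_layers, n_heads, d_model, and d_head come from the description above; n_ctx, d_vocab, and act_fn are illustrative assumptions:

```python
from transformer_lens import HookedTransformer, HookedTransformerConfig

cfg = HookedTransformerConfig(
    n_layers=3,    # stated above
    n_heads=4,     # stated above
    d_model=510,   # stated above
    d_head=170,    # stated above
    n_ctx=22,      # assumed: length of "dddddd-dddddd=sddddddd"
    d_vocab=13,    # assumed: digits 0-9 plus "-", "+", "="
    act_fn="relu", # assumed activation
)
model = HookedTransformer(cfg)
```

Note that n_heads × d_head (680) exceeds d_model (510); this is permitted, since the concatenated head outputs are projected back down to d_model by the attention output matrix.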
The Colab notebook used to train the model is here: https://github.com/apartresearch/Verified_addition/blob/main/assets/Accurate_Math_Train.ipynb
The Colab notebook used to analyse the model is here: https://github.com/apartresearch/Verified_addition/blob/main/assets/Accurate_Math_Analyse.ipynb