Uggghhh what a busy week. Been up all night fixing Intellite, and I've made some progress I wanted to share.
ALL the errors we ran into (there are a lot):
- Gradient vanishing (AI dies)
- Gradient explosion (AI barely learns and destabilizes)
- Activation explosion (AI dies)
- Layer scale crushing the signal (AI dies)
- Activation explosion (again) (AI dies)
All fixed, but now it keeps overfitting on datasets due to a dataset handler error 😩
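For anyone fighting the same gradient problems, here is a rough sketch of one common guard: measure the global gradient norm each step, flag it if it looks vanishing or exploding, and rescale it when it gets too large. This is not the actual Intellite fix. The function names, the thresholds (`low`, `high`), and the plain-list gradient format are all assumptions made for illustration.

```python
import math

def global_grad_norm(grads):
    """L2 norm over every gradient value in every layer."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))

def diagnose(norm, low=1e-6, high=1e3):
    """Label a gradient norm. The thresholds are illustrative guesses."""
    if not math.isfinite(norm) or norm > high:
        return "exploding"
    if norm < low:
        return "vanishing"
    return "ok"

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so their global norm is at most max_norm.

    Returns (clipped_grads, original_norm). Non-finite norms are left
    untouched so the caller can decide to skip that training step.
    """
    norm = global_grad_norm(grads)
    if not math.isfinite(norm) or norm <= max_norm:
        return grads, norm
    scale = max_norm / norm
    return [[g * scale for g in layer] for layer in grads], norm

# Hypothetical gradients for two tiny layers.
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
print(diagnose(norm), norm, global_grad_norm(clipped))
```

Clipping only helps with the exploding side. Vanishing gradients usually need changes to initialization, normalization, or residual scaling instead, so the `diagnose` label is mostly useful for telling the two cases apart in logs.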
All the GPUs you have suddenly vanish, and any device with a GPU becomes CPU-only
Every time you use a HF space you randomly start dancing for 5 minutes OR Every HF model you’ve hearted / liked suddenly disappears and becomes inaccessible
Training works fine: normal loss, no more gradient explosion or gradient vanishing, etc. BUT, before I officially flip the switch and turn on training, I wanna make sure it's the best possible 100M-parameter model it can be, so I'm working a bit more (probably an extra 3-5 days) to add even more innovative AI improvements to Intellite.