You made gold!
Nice model! I frankenmerged it with itself for fun and it still works. The problem, as you already mentioned, is the GPTisms.
Cool! How did you merge it with itself?
I used mergekit (https://github.com/cg123/mergekit); you can see my config in config.yml. The merged model is https://huggingface.co/ChuckMcSneed/DoubleGold-v0.1-123b-32k btw.
Thanks!
BTW, I was able to overcome most of the "GPTisms" in the next iteration of Aurelian, but I ran into an issue while training and need to revert and continue from an older checkpoint (I gained more diverse prose at the cost of long-context capability, which is the whole point of the model). Hopefully I'll have a fixed version out in about a week, and I think it is possible to have both.
Details in the recent post in the Reddit thread.
Edit: Fixed version (interim) here: https://huggingface.co/grimulkan/aurelian-v0.5-70b-rope8-32K-fp16