Uploaded GGUF and EXL2 as Phi 3.1
The change in performance is so huge that you're really doing yourselves a disservice by not renaming it! It may get swept under the rug because people will assume you just updated the README.
I've uploaded GGUF and EXL2 here as Phi 3.1:
https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF
https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-exl2
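If you'd rather script the download, here's a minimal sketch using huggingface_hub; the exact GGUF filename below is a guess, so check the repo's file list for the quant you actually want:

```python
# Minimal sketch: pull a single GGUF quant from the Phi-3.1 repo.
# The filename is an assumption -- check the repo's Files tab for the
# exact quant you want (Q4_K_M, Q6_K, etc.).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/Phi-3.1-mini-4k-instruct-GGUF",
    filename="Phi-3.1-mini-4k-instruct-Q4_K_M.gguf",  # assumed name
)
print(path)
```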
Looks like they bumped the mini-128k too.
Yeah, sadly 128k still isn't supported in llama.cpp :(
NotImplementedError: The rope scaling type longrope is not supported yet
It's possible you could create them, but in practice they would just behave the same as the 4k model.
See, I thought it had too, thank you for finding that. Looking at the changelog, they may have changed it to a new rope method :') It used to be a regular rope with a short factor and a long factor; now it's their new longrope...
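If you want to verify that yourself, here's a quick sketch that pulls config.json and prints whatever rope_scaling the checkpoint declares; the repo id is just an example, so point it at whichever model or revision you're curious about:

```python
# Sketch: check which rope scaling a checkpoint's config declares.
# The older configs used a "su"-style entry with short_factor/long_factor
# lists; the updated ones report "longrope", which is what llama.cpp's
# converter rejects.
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="microsoft/Phi-3-mini-128k-instruct",  # example repo id
    filename="config.json",
)
with open(config_path) as f:
    config = json.load(f)

rope = config.get("rope_scaling") or {}
print("rope scaling type:", rope.get("type"))
print("has short/long factors:", "short_factor" in rope and "long_factor" in rope)
```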
All the more important to distinguish between the versions, then.