Uploaded GGUF and EXL2 as Phi 3.1
The change in performance is so huge that you're really doing yourselves a disservice by not renaming it! It may get swept under the rug because people will assume you just updated the README.
I've uploaded GGUF and EXL2 here as Phi 3.1:
https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF
https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-exl2
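If you'd rather script the download, here's a minimal sketch using huggingface_hub; the exact GGUF filename below is a guess, so check the repo's file list for the quant you actually want:

```python
# Minimal sketch: pull a single GGUF quant from the Phi-3.1 repo.
# The filename is an assumption -- check the repo's Files tab for the
# exact quant you want (Q4_K_M, Q6_K, etc.).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="bartowski/Phi-3.1-mini-4k-instruct-GGUF",
    filename="Phi-3.1-mini-4k-instruct-Q4_K_M.gguf",  # assumed name
)
print(path)
```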
Looks like they bumped the mini-128k too.
Yeah, sadly 128k still isn't supported in llama.cpp :(
NotImplementedError: The rope scaling type longrope is not supported yet
It's possible you could create them, but in practice they would just behave the same as the 4k model.
See, I thought it had too, thank you for finding that. Looking at the changelog, they may have changed it to a new rope method :') It used to be a regular rope with a short factor and a long factor; now it's their new longrope...
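If you want to verify that yourself, here's a quick sketch that pulls config.json and prints whatever rope_scaling the checkpoint declares; the repo id is just an example, so point it at whichever model or revision you're curious about:

```python
# Sketch: check which rope scaling a checkpoint's config declares.
# The older configs used a "su"-style entry with short_factor/long_factor
# lists; the updated ones report "longrope", which is what llama.cpp's
# converter rejects.
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="microsoft/Phi-3-mini-128k-instruct",  # example repo id
    filename="config.json",
)
with open(config_path) as f:
    config = json.load(f)

rope = config.get("rope_scaling") or {}
print("rope scaling type:", rope.get("type"))
print("has short/long factors:", "short_factor" in rope and "long_factor" in rope)
```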
All the more important to distinguish between the versions, then.