Problems with FP32 model

#25
by YardWeasel - opened

I'm using Ooba's Web UI (https://github.com/oobabooga/text-generation-webui).

I downloaded both the FP16 and FP32 models. The FP16 version works perfectly, but the FP32 version has problems:

  • It has not yet been updated with the chat-template fix already applied to the FP16 version, so starting a chat will crash unless you apply the fix manually.
  • Once the template is fixed and you start chatting, the output is nonsense.

For example, if I ask FP32, "Who is George Washington?" the output will be multiple lines of periods:

...?……………………………………………………………………………………………………………………………………………………………………………………………………………......…………………………...…………………………………………………………………………………………………………………………………………………………………………………………...……………………...……………………………………………………………………………………………

I'm just using the default parameter settings of Ooba which work fine for the FP16 model.

Here are the links to the two models I'm talking about:
https://huggingface.co/unsloth/gpt-oss-20b-GGUF/blob/main/gpt-oss-20b-F16.gguf
https://huggingface.co/unsloth/gpt-oss-20b-GGUF/blob/main/gpt-oss-20b-F32.gguf

I verified both models with SHA256 sums.
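For anyone wanting to rule out a corrupted download the same way, here is a minimal sketch of the verification workflow. The filename and the stand-in file are placeholders (the real GGUFs are many GB, and the published hashes are listed on each file's Hugging Face page); the point is the two-column format that `sha256sum -c` expects.

```shell
# Create a stand-in file in place of the multi-GB GGUF download.
printf 'demo' > model.gguf
# Record its digest in the "HASH  FILENAME" format sha256sum expects.
# In practice you would paste the published hash here instead.
sha256sum model.gguf > SHA256SUMS
# Re-verify; "model.gguf: OK" means the file matches the recorded hash,
# and a non-zero exit status means it does not.
sha256sum -c SHA256SUMS
```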

Also note that I can run the 6-bit and 8-bit quants of your GPT OSS 120B GGUF model, and they work perfectly despite being larger than the FP32 model here. So I don't think it's a memory issue.

For reference, I'm talking about these two (which work fine):
https://huggingface.co/unsloth/gpt-oss-120b-GGUF/tree/main/UD-Q6_K_XL
https://huggingface.co/unsloth/gpt-oss-120b-GGUF/tree/main/UD-Q8_K_XL

System: Intel 13900K, 128GB RAM, 5060 Ti 16GB + 4060 Ti 16GB.

Unsloth AI org

Thanks, will investigate. For now we'll delete it, as it's the only version we did not update.

OK, I hope to see it again. I wasn't trying to get it deleted.
