Getting "tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)" error when trying to load the model in llama.cpp

#4 · opened by KernelDebugger

Hello. I'm trying to load the model in llama.cpp (CLI and server) and getting this error:

srv load_model: loading model '../../../gpt-oss-20b-F16.gguf'
llama_model_load_from_file_impl: using device CUDA0 (NVIDIA GeForce RTX 4070 Ti SUPER) - 15429 MiB free
gguf_init_from_file_impl: tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)
gguf_init_from_file_impl: failed to read tensor info
llama_model_load: error loading model: llama_model_loader: failed to load model from ../../../gpt-oss-20b-F16.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '../../../gpt-oss-20b-F16.gguf'
srv load_model: failed to load model, '../../../gpt-oss-20b-F16.gguf'
srv operator(): operator(): cleaning up before exit...
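From what I can tell, type 39 here is GGML_TYPE_MXFP4, the new quantization type ggml added for the gpt-oss models; a llama.cpp build from before that addition doesn't recognize the id and reports it as NONE (that mapping is my assumption from the current ggml headers). A quick way to see which build you're on:

# Print the llama.cpp build version (binary path is illustrative; adjust to your install)
./build/bin/llama-cli --version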

I'm also seeing this. In the other thread it was mentioned that a new version is coming with chat-template fixes. Hopefully that fix also covers this? If not, this needs attention as well!

I think I’ve downloaded the fixed version, from exactly the link that says "This is the new MXFP4_MOE quant renamed to F16 with our chat template fixes! Use GGUFs here."
And it still shows this error.

Referring to this comment: https://huggingface.co/unsloth/gpt-oss-20b-GGUF/discussions/2#68927da38491c63d06a29dd6

That comment was posted 20 minutes ago, the models were uploaded an hour ago, and the README was updated around 40 minutes ago, so I'm assuming another upload is coming. But I'm not 100% sure.

But, as mentioned, I'm also seeing this error. So I'm just waiting for fixes.

Unsloth AI org

Hello! You guys need to recompile and update llama.cpp!!

@JamesMowery @KernelDebugger
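For reference, the standard rebuild looks like this (repo URL is llama.cpp's official one; the CUDA flag is only needed if you want the CUDA backend, and paths are illustrative):

# Fetch the latest sources
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
# Configure with the CUDA backend and build in Release mode
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j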

Yeah, just as I thought. Thanks!

Hello! You guys need to recompile and update llama.cpp!!

@JamesMowery @KernelDebugger

I'm on version b6092-1, which was just released. Is there another version or update coming? If so, I'll just wait for the official release to hit Arch.

Edit: Looks like I do need to wait a bit longer for it to hit Arch. That isn't the latest. Thanks!
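If you're on the distro package rather than a source build, something like this shows what's installed (I'm assuming the Arch package is named llama.cpp; adjust if yours differs):

# Query the installed package details, including the version, on Arch
pacman -Qi llama.cpp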

Just rebuilt llama.cpp from source; everything works perfectly now, thanks!
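For anyone following along, a quick smoke test after rebuilding (model path and prompt are just examples):

# Load the model and generate a few tokens to confirm it now parses the MXFP4 tensors
./build/bin/llama-cli -m ../../../gpt-oss-20b-F16.gguf -p "Hello" -n 32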

KernelDebugger changed discussion status to closed

Welp, guess I'm stuck with not using these models.
