unused tensors?

#2 opened by jacek2024

I am trying to use GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002.gguf and I get the following messages:

model has unused tensor blk.46.attn_norm.weight (size = 16384 bytes) -- ignoring
model has unused tensor blk.46.attn_q.weight (size = 28311552 bytes) -- ignoring
model has unused tensor blk.46.attn_k.weight (size = 2359296 bytes) -- ignoring
model has unused tensor blk.46.attn_v.weight (size = 2359296 bytes) -- ignoring
model has unused tensor blk.46.attn_q.bias (size = 49152 bytes) -- ignoring
model has unused tensor blk.46.attn_k.bias (size = 4096 bytes) -- ignoring
model has unused tensor blk.46.attn_v.bias (size = 4096 bytes) -- ignoring
model has unused tensor blk.46.attn_output.weight (size = 28311552 bytes) -- ignoring
model has unused tensor blk.46.post_attention_norm.weight (size = 16384 bytes) -- ignoring
model has unused tensor blk.46.ffn_gate_inp.weight (size = 2097152 bytes) -- ignoring
model has unused tensor blk.46.exp_probs_b.bias (size = 512 bytes) -- ignoring
model has unused tensor blk.46.ffn_gate_exps.weight (size = 415236096 bytes) -- ignoring
model has unused tensor blk.46.ffn_down_exps.weight (size = 784334848 bytes) -- ignoring
model has unused tensor blk.46.ffn_up_exps.weight (size = 415236096 bytes) -- ignoring
model has unused tensor blk.46.ffn_gate_shexp.weight (size = 3244032 bytes) -- ignoring
model has unused tensor blk.46.ffn_down_shexp.weight (size = 6127616 bytes) -- ignoring
model has unused tensor blk.46.ffn_up_shexp.weight (size = 3244032 bytes) -- ignoring
model has unused tensor blk.46.nextn.eh_proj.weight (size = 18874368 bytes) -- ignoring
model has unused tensor blk.46.nextn.embed_tokens.weight (size = 349175808 bytes) -- ignoring
model has unused tensor blk.46.nextn.enorm.weight (size = 16384 bytes) -- ignoring
model has unused tensor blk.46.nextn.hnorm.weight (size = 16384 bytes) -- ignoring
model has unused tensor blk.46.nextn.shared_head_head.weight (size = 349175808 bytes) -- ignoring
model has unused tensor blk.46.nextn.shared_head_norm.weight (size = 16384 bytes) -- ignoring

any hints?

As far as I know, when GLM-4.5 support was implemented in llama.cpp it was decided not to use the last layer, because that layer belongs to MTP (multi-token prediction), which llama.cpp does not implement. The blk.46 "nextn" tensors from that layer are therefore loaded but ignored.
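If you want to check this yourself, here is a minimal sketch using the gguf Python package that ships with llama.cpp. The split file names are taken from your log (the second split name is inferred from the "00001-of-00002" pattern), and which split actually holds the blk.46 tensors is an assumption, so the sketch just scans both:

```python
from gguf import GGUFReader

# The blk.46 (NextN/MTP) tensors may sit in either split file, so scan both.
split_files = [
    "GLM-4.5-Air-UD-Q4_K_XL-00001-of-00002.gguf",
    "GLM-4.5-Air-UD-Q4_K_XL-00002-of-00002.gguf",
]

mtp_tensors = []
for path in split_files:
    reader = GGUFReader(path)
    # Collect every tensor name stored in this split that belongs to blk.46.
    mtp_tensors += [t.name for t in reader.tensors if t.name.startswith("blk.46.")]

print(f"{len(mtp_tensors)} tensors under blk.46 (the NextN/MTP block):")
for name in mtp_tensors:
    print("  ", name)
```

The names printed should match the "unused tensor" lines above; everything llama.cpp skips lives under blk.46, so no regular transformer layer is affected.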

ah so it's normal, thanks
