Would like some assistance; the model won't run.
No quants needed, just some help.
Need help with this model. I made a quant with the gguf repo and it doesn't work, so I assume something is wrong with the model. Is there any way to debug this or replace some parts with those of the base model, or is my merge config to blame?
Link to model: https://huggingface.co/Vortex5/tobenamed-24B
It's queued! :D
Let's just see how it turns out. If it is broken, maybe I'll see why when trying it.
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#tobenamed-24B-GGUF for quants to appear.
@Vortex5 The dry-run check we implemented inside our llama.cpp fork detected the following error:
load_tensors: loading model tensors, this can take a while... (mmap = false)
llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd.weight' has wrong shape; expected 5120, 131074, got 5120, 131072, 1, 1
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model 'tobenamed-24B.gguf~'
main: error: unable to load model
dryrun failed
The only line of interest is:
llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd.weight' has wrong shape; expected 5120, 131074, got 5120, 131072, 1, 1
So the issue with your model is that the token_embd.weight tensor has the wrong shape: it is 5120, 131072, 1, 1 but should be 5120, 131074. No idea how this could have happened. It's only off by 2 and somehow has 2 additional 1-sized dimensions.
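For reference, the check that fails here roughly amounts to comparing the tensor's stored dimensions against the expected ones, with GGUF padding unused trailing dimensions to 1 (which is why the log shows the extra ", 1, 1"). A minimal Python sketch of that comparison, using the numbers from the dry-run log above (this is an illustration of the logic, not llama.cpp's actual C++ code):

```python
def check_tensor_dims(name, got, expected):
    """Mimic the spirit of llama.cpp's check_tensor_dims: GGUF stores up to
    4 dimensions and pads unused trailing ones with 1, so strip those
    before comparing against the expected shape."""
    got = list(got)
    while len(got) > len(expected) and got[-1] == 1:
        got.pop()
    if got != list(expected):
        raise ValueError(
            f"tensor '{name}' has wrong shape; expected {expected}, got {got}"
        )

# The failing case from the log: 131072 embedding rows where 131074 were expected.
try:
    check_tensor_dims("token_embd.weight", (5120, 131072, 1, 1), (5120, 131074))
except ValueError as e:
    print(e)
```

So the trailing 1-sized dimensions are harmless padding; the real problem is only the 131072 vs. 131074 mismatch in the second dimension.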
I think the merge might have been ended too soon, causing the error. Thanks for helping. I am going to merge it again; I don't feel like setting up llama.cpp to fix it.
Most commonly this is a vocabulary mismatch, i.e. the weights are from a 131072-vocab model, but the vocabulary has increased in size to 131074.
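When that happens (e.g. the tokenizer gained two special tokens but the merged checkpoint kept the old embedding matrix), one common fix is to pad the embedding matrix to the new vocabulary size before converting to GGUF. A hedged numpy sketch of that padding step, with toy sizes; initializing new rows to the mean of existing embeddings is a common heuristic, not necessarily what any particular merge tool does:

```python
import numpy as np

def resize_embeddings(embed, new_vocab):
    """Pad an (old_vocab, hidden) embedding matrix to new_vocab rows.
    New rows are initialized to the mean of the existing embeddings,
    a simple heuristic for freshly added special tokens."""
    old_vocab, hidden = embed.shape
    if new_vocab <= old_vocab:
        return embed[:new_vocab]  # shrink by truncating
    extra = np.tile(embed.mean(axis=0, keepdims=True), (new_vocab - old_vocab, 1))
    return np.concatenate([embed, extra], axis=0)

# Toy example: a (vocab=6, hidden=2) matrix grown to 8 rows, mirroring
# the real 131072 -> 131074 mismatch from the error message.
embed = np.arange(12, dtype=np.float32).reshape(6, 2)
resized = resize_embeddings(embed, 8)
print(resized.shape)  # (8, 2)
```

In the transformers ecosystem the equivalent operation is `model.resize_token_embeddings(len(tokenizer))` before saving, which keeps the weights and tokenizer vocabulary in sync.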