Something wrong with the quants?

#1
by Sir-Dan - opened

Hi, I can't load these in either LM Studio or the latest ollama.

"llama.cpp error: 'check_tensor_dims: tensor 'rope_freqs.weight' has wrong shape; expected 48, got 64, 1, 1, 1'"

Which of the quants? I'll check.

I tried Q8 and Q6_K. Thanks!

Tested Q8 on koboldcpp and ooba; both work flawlessly.
ollama might be sensitive to a tokenizer mismatch, or there may be a problem with nvidia's nemotron base.

Either way, it's an ollama issue, since it works in the other front ends I tested.
Meanwhile, you can try other quants.
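
For anyone puzzling over the numbers in the error quoted above, here is a minimal sketch of one way a 48 vs. 64 mismatch could arise. In llama.cpp, `rope_freqs.weight` holds one entry per pair of rotary dimensions, so its length is half the rotary dimension. The sketch assumes the loader derives the head size as hidden_size / n_heads, while the file was converted with an explicitly configured head size. The values 3072, 32 and 128 are illustrative assumptions picked to reproduce 48 and 64; they are not confirmed from this model's config.

```python
# Hypothetical illustration: none of these values are read from the actual model.

def rope_freqs_len(rotary_dim: int) -> int:
    """Length of rope_freqs.weight: one frequency per pair of rotary dims."""
    return rotary_dim // 2

# Assumed config values (illustrative only).
hidden_size = 3072        # assumption
n_heads = 32              # assumption
explicit_head_dim = 128   # assumption: head size set explicitly in the config

# A loader that ignores the explicit head size derives it from the hidden size.
derived_head_dim = hidden_size // n_heads

expected_by_loader = rope_freqs_len(derived_head_dim)
stored_in_file = rope_freqs_len(explicit_head_dim)

print(f"loader expects {expected_by_loader}, file has {stored_in_file}")
```

If that is what is happening here, a front end bundling a newer llama.cpp build that honours the explicit head size would load the same file without complaint, which would fit koboldcpp and ooba working.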

SicariusSicariiStuff changed discussion status to closed

Indeed koboldcpp works.
