Something wrong with the quants?

by Sir-Dan - opened 15 days ago

15 days ago

Hi, I can't load these in either LM Studio or the latest ollama.

"llama.cpp error: 'check_tensor_dims: tensor 'rope_freqs.weight' has wrong shape; expected 48, got 64, 1, 1, 1'"
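For anyone trying to read that message, here is a minimal sketch that pulls the tensor name and the two shapes out of it. The helper names (`parse_shape_error`, `implied_rotary_dim`) are made up for illustration. The "rotary dimension = 2 x length of rope_freqs" step is an assumption about how that tensor is laid out, not something confirmed in this thread.

```python
import re

# Hypothetical helper, not part of llama.cpp: parses a "wrong shape" error string.
_PATTERN = re.compile(
    r"tensor '(?P<name>[^']+)' has wrong shape; "
    r"expected (?P<expected>[\d, ]+?), got (?P<got>[\d, ]+)"
)


def parse_shape_error(message):
    """Return the tensor name plus expected/actual dims, or None if no match."""
    match = _PATTERN.search(message)
    if match is None:
        return None

    def to_dims(text):
        # "64, 1, 1, 1" -> [64, 1, 1, 1]
        return [int(part) for part in text.split(",") if part.strip()]

    return {
        "tensor": match.group("name"),
        "expected": to_dims(match.group("expected")),
        "got": to_dims(match.group("got")),
    }


def implied_rotary_dim(dims):
    # ASSUMPTION: rope_freqs holds one entry per pair of rotary dimensions,
    # so the rotary dimension would be twice the first dim.
    return 2 * dims[0]


error = (
    "llama.cpp error: 'check_tensor_dims: tensor 'rope_freqs.weight' "
    "has wrong shape; expected 48, got 64, 1, 1, 1'"
)
info = parse_shape_error(error)
print(info["tensor"], info["expected"], info["got"])
print(implied_rotary_dim(info["expected"]), implied_rotary_dim(info["got"]))
```

If the two implied values differ, one possible reading is that the loader and the file disagree about the rotary/head size, but that is a guess and would need checking against the model's metadata.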

Owner 15 days ago

which of the quants? I'll check it

15 days ago

I tried Q8 and Q6_K. Thanks!

Owner 15 days ago

tested Q8 on koboldcpp + ooba, both work flawlessly.
ollama might be sensitive to a tokenizer mismatch, or it could be a problem with nvidia's nemotron base.

either way, it's an ollama issue, as it works in the other front ends I tested.
meanwhile you can try other quants.

SicariusSicariiStuff changed discussion status to closed 15 days ago

15 days ago

Indeed koboldcpp works.
