Problem with Quants

#2
by Clevyby - opened

Hey, noob here. It appears that the quants you made don't work. I used koboldcpp's official Colab, and it output this error:

error loading model: create_tensor: tensor 'output_norm.weight' not found

BlueNipples's quant of this worked fine, though; correct me if I'm wrong.

Owner

Hello, I don't have any experience with koboldcpp, but I'll have to try it. It works fine from llama.cpp.

I suppose you can check out the koboldcpp GitHub page for more info. I myself don't use llama.cpp, as koboldcpp is more user-friendly, and I exclusively use Colab (specifically the one that uses koboldcpp) since I don't have any capable hardware. Well, your DaringLotus GGUF quant worked; dunno what you did differently here.

Yeah, these do actually look like the wrong file sizes. Q8 should be about 11 GB, for example. Check these file sizes for a point of comparison (same model size):

https://huggingface.co/NyxKrage/FrostMaid-10.7B-TESTING-GGUF/tree/main

Your Q4_K_M is smaller than my Q3_K_M, I believe:

https://huggingface.co/BlueNipples/DaringLotus-SnowLotus-10.7b-IQ-GGUF/tree/main

File size is a function of model size and bits per weight, so that shouldn't happen.
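That relationship can be sketched as back-of-the-envelope arithmetic. A minimal sanity check, assuming typical average bits-per-weight figures for llama.cpp quant types (the bpw values below are rough averages, not exact, and metadata overhead is ignored):

```python
# Rough sanity check: GGUF file size scales with parameter count
# times average bits per weight.
def approx_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in GB, ignoring metadata overhead."""
    return n_params * bits_per_weight / 8 / 1e9

# Approximate average bpw per quant type (assumption, not exact figures).
BPW = {"Q8_0": 8.5, "Q4_K_M": 4.85, "Q3_K_M": 3.9}

for name, bpw in BPW.items():
    print(f"{name}: ~{approx_gguf_size_gb(10.7e9, bpw):.1f} GB")
```

For a 10.7B model this gives roughly 11 GB at Q8_0, which matches the expectation above; a Q4_K_M coming out smaller than a Q3_K_M of the same model is a red flag that the conversion dropped something.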

Owner

Yup, something didn't work: the inferred model size seems to be empty. Working on it rn, I'll let you know.

Sweet. I only caught this because we were talking about quant sizes on Discord and I looked. All good, I'm sure you'll fix it! Nice thing to have done in any case.