Mmproj files too small?

#1
by jmi1234 - opened

Hey! Thank you for all the hard work! I tried to load the model but got an error loading the mmproj files, and I noticed they are only 1.54 kB in size. Is this intended?
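A kB-sized .gguf usually isn't real model data. One quick sanity check (a sketch, not an official tool) is to look at the first bytes: valid GGUF files start with the 4-byte ASCII magic `GGUF`, while an unresolved Git LFS pointer is a small text file beginning with `version https://git-lfs.github.com/spec/v1`. The function name here is made up for illustration:

```shell
# check_gguf: hypothetical helper that reports whether a file looks
# like a real GGUF model or an unresolved Git LFS pointer.
check_gguf() {
  # Read the first 4 bytes of the file.
  case "$(head -c 4 "$1")" in
    GGUF) echo "real GGUF file" ;;                     # GGUF magic present
    vers) echo "Git LFS pointer, not model data" ;;    # "version https://git-lfs..."
    *)    echo "unknown format (maybe a failed/HTML download)" ;;
  esac
}
```

If the mmproj file turns out to be an LFS pointer or an HTML error page, re-downloading it should fix the size.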

If you don't care about multimodal inputs, you can add --no-mmproj to the llama-server args to skip loading these completely.

Anyway, even with that, the quants seem to produce only sequences of `:::::...`

Thanks, I did know about `--no-mmproj`, but I do care about multimodal and wanted to flag that the mmproj files may not be working currently.

I tried it without using multimodal, though, and it does produce results for me with the following arguments:

--model /var/lib/gpustack-worker-nvidia/cogito-v2/cogito-v2-preview-llama-109B-MoE-Q4_K_S-00001-of-00002.gguf --alias unsloth/cogito-v2-preview-llama-109B-MoE-GGUF --no-mmap --no-warmup --temp 0.7 --jinja --ctx-size 32768 --threads 32 --top-p 0.8 --min-p 0.05 --top-k 20 --no-mmproj --ngl 22 --parallel 1
