Mmproj files too small?

#1
by jmi1234 - opened

Hey! Thank you for all the hard work! I tried to load the model but got an error loading the mmproj files, and I noticed they are only 1.54 kB in size. Is this intended?
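A kB-sized .gguf usually isn't real model data. One quick sanity check (a sketch, not an official tool) is to look at the first bytes: valid GGUF files start with the 4-byte ASCII magic `GGUF`, while an unresolved Git LFS pointer is a small text file beginning with `version https://git-lfs.github.com/spec/v1`. The function name here is made up for illustration:

```shell
# check_gguf: hypothetical helper that reports whether a file looks
# like a real GGUF model or an unresolved Git LFS pointer.
check_gguf() {
  # Read the first 4 bytes of the file.
  case "$(head -c 4 "$1")" in
    GGUF) echo "real GGUF file" ;;                     # GGUF magic present
    vers) echo "Git LFS pointer, not model data" ;;    # "version https://git-lfs..."
    *)    echo "unknown format (maybe a failed/HTML download)" ;;
  esac
}
```

If the mmproj file turns out to be an LFS pointer or an HTML error page, re-downloading it should fix the size.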

If you don't care about multimodal inputs, you can add --no-mmproj to the llama-server args to skip loading these completely.

Anyway, even with that, the quants seem to produce only sequences of `:::::...`

Thanks, I did know about `--no-mmproj`, but I do care about multimodal and wanted to flag that the mmproj files may not be working currently.

I tried it without using multimodal, though, and it does produce results for me with the following arguments:

--model /var/lib/gpustack-worker-nvidia/cogito-v2/cogito-v2-preview-llama-109B-MoE-Q4_K_S-00001-of-00002.gguf --alias unsloth/cogito-v2-preview-llama-109B-MoE-GGUF --no-mmap --no-warmup --temp 0.7 --jinja --ctx-size 32768 --threads 32 --top-p 0.8 --min-p 0.05 --top-k 20 --no-mmproj --ngl 22 --parallel 1
