Loading an image or video produces unreadable words and characters in the response
Hi,
I downloaded the files below to my PC:
model: MiMo-VL-7B-RL-2508.Q4_K_M.gguf
projector: MiMo-VL-7B-RL-2508.mmproj-f16.gguf
Then I ran the latest llama.cpp build (6139) with the command below:
.\llama\llama-server -m .\Models\Xiaomi\MiMo-VL-7B-RL-2508.Q4_K_M.gguf --mmproj .\Models\Xiaomi\MiMo-VL-7B-RL-2508.mmproj-f16.gguf
Responses are fine for text-based queries and the thinking output is good, but
if I load an image or video, the response is unreadable words and characters, like in the screenshot below.
I got the exact same thing using the F16 version of the GGUF, only my output was nothing but the letter "G" repeating.
SOLVED: Go grab unsloth's mmproj-F32.gguf from here: https://huggingface.co/unsloth/MiMo-VL-7B-RL-GGUF/ and it will work just fine.
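For anyone landing here later, this is roughly what the original command looks like with only the projector swapped. It is a sketch, not a tested setup: the location .\Models\Xiaomi\mmproj-F32.gguf is an assumption about where you saved the downloaded file, so adjust it to your own path.

```powershell
# Same model file as in the original post; only --mmproj changes.
# Assumed path: mmproj-F32.gguf was saved next to the model in .\Models\Xiaomi\
.\llama\llama-server -m .\Models\Xiaomi\MiMo-VL-7B-RL-2508.Q4_K_M.gguf --mmproj .\Models\Xiaomi\mmproj-F32.gguf
```

Text-only queries do not go through the projector, which would fit with text responses already working before the swap.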