It seems GGUF is not compatible with CPU-only inference on vLLM... Furthermore, llama.cpp can run GGUF models, but it will not provide multimodal support with this model...
Any suggestions for running the multimodal model quantized? (I'm running on an old CPU only...)