How do I create a GGUF file for the LLaMA 3.2 Vision model (MllamaForConditionalGeneration)?
I am trying to convert the LLaMA 3.2 11B Vision Instruct model (MllamaForConditionalGeneration) into the GGUF format to use it with llama.cpp.
I have attempted to use the official convert_hf_to_gguf.py script from llama.cpp, but it currently throws an error saying that the model architecture is not supported. I also tried using convert_hf_to_gguf_update.py, but that script is intended only for tokenizer updates.
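For anyone trying to reproduce this, here is a minimal sketch of how the architecture name a checkpoint declares could be checked before running a converter. It assumes the usual Hugging Face layout, where `config.json` sits in the model directory and lists class names under an `architectures` key. The helper name `declared_architectures` and the local path are hypothetical and not part of llama.cpp.

```python
import json
from pathlib import Path


def declared_architectures(model_dir):
    """Return the `architectures` list from a checkpoint's config.json."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # Hugging Face configs usually store the model class names under this key;
    # fall back to an empty list if the key is missing.
    return config.get("architectures", [])


if __name__ == "__main__":
    # Hypothetical local path to the downloaded checkpoint (an assumption).
    model_dir = "./Llama-3.2-11B-Vision-Instruct"
    if (Path(model_dir) / "config.json").exists():
        print(declared_architectures(model_dir))
```

If the printed name is not one the conversion script recognizes, that would be consistent with the "architecture is not supported" error described above.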
Given the recent addition of vision capabilities in LLaMA 3.2, I would greatly appreciate guidance or examples on how to properly convert this model into GGUF format, handling the vision projection layers and tokenizer correctly.
Specifically, I would appreciate help with:
- Any updated or specialized conversion scripts that support MllamaForConditionalGeneration.
- Recommended best practices or workarounds for export and conversion.
- Any pointers to community or official tools that can handle vision model conversion successfully.
Thank you very much for your support!
