How did you quantize

#1
by ctranslate2-4you - opened

Would you mind sharing the script that you used to quantize this model? I've had extreme difficulty quantizing these mixed vision/chat models with bitsandbytes.

Hi there. You can find my quantization process in the model card. I have tested that the quantized safetensors model preserves the original vision reasoning capabilities. The quantization was done on a single 16 GB NVIDIA graphics card using less than 8 GB of VRAM. I haven't tried the same quantization method on other multimodal models, but you may give it a try.
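
As a rough sanity check on VRAM figures like the ones above, here is a minimal sketch that estimates the memory needed for model weights alone at different bit widths. The parameter count and the bit widths are hypothetical, chosen only for illustration; they are not taken from this thread or the model card. The helper name `weight_memory_gib` is also made up for this example. The estimate ignores activations, quantization metadata, and framework overhead.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


# Hypothetical parameter count (assumption, not from this thread).
NUM_PARAMS = 11_000_000_000

# Compare a few common bit widths (also assumptions for illustration).
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumed numbers, the 16-bit weights alone would not fit on a 16 GB card, while the 4-bit weights would fit in under 8 GB, which is why a lower bit width can make a single-GPU workflow feasible.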
