# Pixtral-12B-2409: int4 Weight Quant

W4A16 quantization of mistral-community/pixtral-12b using the kylesayrs/gptq-partition branch of LLM Compressor, for optimised inference on vLLM.

The vision_tower is kept at FP16; the language_model weights are quantized to 4-bit (group size 128, per the -G128 suffix).

Calibrated on 512 Flickr samples.
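
For context, producing a quant like this with LLM Compressor generally looks like the sketch below. This is an assumption-laden reconstruction, not the exact script used: the recipe, ignore patterns, and dataset handling follow LLM Compressor's published multimodal GPTQ examples, and the kylesayrs/gptq-partition branch may take additional arguments not shown here.

```python
from transformers import AutoProcessor, LlavaForConditionalGeneration
from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "mistral-community/pixtral-12b"

model = LlavaForConditionalGeneration.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# GPTQ on the language model's Linear layers only: 4-bit weights,
# 16-bit activations (W4A16). The vision tower and multimodal
# projector are excluded so they stay at full precision.
recipe = GPTQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=["re:.*lm_head", "re:vision_tower.*", "re:multi_modal_projector.*"],
)

# Calibrate on 512 Flickr samples, then write compressed weights.
oneshot(
    model=model,
    dataset="flickr30k",              # assumed dataset identifier
    splits={"calibration": "test[:512]"},
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    output_dir="pixtral-12b-2409-W4A16-G128",
)
```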

## Example vLLM usage

```
vllm serve nintwentydo/pixtral-12b-2409-W4A16-G128 --max-model-len 131072 --limit-mm-per-prompt 'image=4'
```
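
Once the server is running, you can send multimodal requests through vLLM's OpenAI-compatible endpoint. A minimal sketch using the openai client (the image URL is a placeholder):

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API, by default at http://localhost:8000/v1.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="nintwentydo/pixtral-12b-2409-W4A16-G128",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                # Placeholder URL; replace with a real image.
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```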

If you want a more advanced, fully featured chat template, you can use this Jinja template and point the server at it with vLLM's --chat-template flag.
