vLLM fails to serve
#2
by dinerburger - opened
Getting the following error:
VLLM_USE_V1=0 vllm serve /opt/models/textgen/gemma-3-27b-it-unsloth-bnb-4bit --max-model-len 32768 --port 9999
ERROR 04-03 14:27:06 [engine.py:448] KeyError: 'layers.47.mlp.down_proj.weight.absmax'
The key is listed in the safetensors index, but vLLM is having trouble resolving it for whatever reason.
Forgot to pass the --quantization parameter, sorry.
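For reference, a likely corrected invocation: the .absmax tensors in the KeyError are bitsandbytes quantization state stored in the pre-quantized checkpoint, which the default loader does not expect, so the quantization method has to be specified explicitly. This is a sketch assuming the bitsandbytes quantization/load-format flags apply to this vLLM version; newer releases may infer the load format on their own:

VLLM_USE_V1=0 vllm serve /opt/models/textgen/gemma-3-27b-it-unsloth-bnb-4bit \
  --quantization bitsandbytes \
  --load-format bitsandbytes \
  --max-model-len 32768 --port 9999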
dinerburger changed discussion status to closed