vLLM fails to serve

#2
by dinerburger - opened

Getting the following error:

VLLM_USE_V1=0 vllm serve /opt/models/textgen/gemma-3-27b-it-unsloth-bnb-4bit --max-model-len 32768 --port 9999

ERROR 04-03 14:27:06 [engine.py:448] KeyError: 'layers.47.mlp.down_proj.weight.absmax'

This key is listed in the safetensors index, but vLLM is having trouble resolving it for whatever reason.

Forgot to pass the --quantization parameter, sorry.
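
For anyone landing here later, a sketch of the corrected invocation, assuming bitsandbytes is the intended quantization backend for this bnb-4bit checkpoint (the exact flag value may differ across vLLM versions):

VLLM_USE_V1=0 vllm serve /opt/models/textgen/gemma-3-27b-it-unsloth-bnb-4bit --max-model-len 32768 --port 9999 --quantization bitsandbytes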

dinerburger changed discussion status to closed
