Running "vllm serve" fails with KeyError: 'layers.0.mlp.down_proj.weight.absmax'

#1
by ly0303521 - opened

My command:
vllm serve --max-model-len=1000 --gpu-memory-utilization=0.5 unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit
It fails with:
KeyError: 'layers.0.mlp.down_proj.weight.absmax'
How can I use this model with vLLM?

Loading the model with OrpheusModel produces the same error:
model = OrpheusModel(model_name="unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit")
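
For context, the missing key ends in `.absmax`, which is one of the per-block scale tensors that bitsandbytes stores next to a 4-bit weight. One plausible reading (an assumption, not confirmed in this thread) is that the loader did not treat the checkpoint as pre-quantized bitsandbytes weights. Below is a minimal sketch for spotting such keys in a list of tensor names. The helper `find_bnb_quant_keys` and the suffix list are hypothetical and illustrative, not part of vLLM or bitsandbytes.

```python
# Suffixes that bitsandbytes 4-bit checkpoints typically attach to a quantized
# weight. Assumption: this list is illustrative and may not be exhaustive.
BNB_SUFFIXES = (".absmax", ".quant_map", ".nested_absmax", ".nested_quant_map")


def find_bnb_quant_keys(tensor_names):
    """Return the tensor names that look like bitsandbytes quantization state."""
    return [
        name
        for name in tensor_names
        if name.endswith(BNB_SUFFIXES) or ".quant_state.bitsandbytes__" in name
    ]


# Example names; the second one is the key from the error message above.
names = [
    "layers.0.mlp.down_proj.weight",
    "layers.0.mlp.down_proj.weight.absmax",
    "layers.0.input_layernorm.weight",
]
quant_keys = find_bnb_quant_keys(names)
print(quant_keys)
```

If keys like these are present, the checkpoint is probably stored in bitsandbytes format. vLLM documents a `--quantization bitsandbytes` option for such models; whether it resolves this particular error for this checkpoint is not verified here.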
