Running with "vllm serve" fails with KeyError: 'layers.0.mlp.down_proj.weight.absmax'

by ly0303521 - opened 4 days ago

4 days ago

My command:
vllm serve --max-model-len=1000 --gpu-memory-utilization=0.5 unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit
It fails with:
KeyError: 'layers.0.mlp.down_proj.weight.absmax'
How can I use this model with vLLM?

ly0303521

4 days ago

Running with OrpheusModel also produces the same error:
model = OrpheusModel(model_name="unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit")
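
For context, the ".absmax" suffix in the failing key is one of the extra tensors that bitsandbytes 4-bit checkpoints store next to each quantized weight, so the error suggests the loader met a quantization-state tensor it did not expect. A minimal sketch of how such keys can be recognized by name; the helper name and the suffix list are assumptions for illustration and are not part of vLLM or OrpheusModel:

```python
# Suffixes that bitsandbytes 4-bit serialization appends to a quantized
# weight's name. Assumption: illustrative list, not taken from vLLM's source.
BNB_4BIT_SUFFIXES = (
    ".absmax",
    ".quant_map",
    ".nested_absmax",
    ".nested_quant_map",
)


def is_bnb_quant_state_key(name: str) -> bool:
    """Return True if a tensor name looks like bitsandbytes 4-bit quant state."""
    # str.endswith accepts a tuple and matches any of the listed suffixes.
    if name.endswith(BNB_4BIT_SUFFIXES):
        return True
    # Packed quant-state entries are named like
    # "<weight>.quant_state.bitsandbytes__nf4".
    return ".quant_state.bitsandbytes__" in name


# The key from the KeyError above, and the plain weight name for comparison.
failing_key = "layers.0.mlp.down_proj.weight.absmax"
print(is_bnb_quant_state_key(failing_key))
print(is_bnb_quant_state_key("layers.0.mlp.down_proj.weight"))
```

If the checkpoint does contain keys like this, one thing that may be worth trying, assuming a vLLM build with bitsandbytes support, is adding --quantization bitsandbytes to the vllm serve command (older releases also expected --load-format bitsandbytes). Whether that resolves the error for this particular model is not confirmed here.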
