Model only outputs !!!!! when used with vLLM

#22
by Jacqkues - opened

I am trying to run the inference code with vLLM on a T4 GPU, but for one page it takes a very long time and the output is only ! tokens.

Jacqkues changed discussion status to closed
Jacqkues changed discussion status to open
IBM Granite org

Hello @Jacqkues ,

Could you please try to use:

vllm==0.10.1.1
transformers==4.55.2
torch==2.7.1
torchvision==0.22.1
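
If it helps, these pins can be installed together in a fresh environment (a sketch; pick the wheel matching your CUDA setup if pip's default build does not fit):

```bash
# Install the exact versions suggested above in one step.
pip install vllm==0.10.1.1 transformers==4.55.2 torch==2.7.1 torchvision==0.22.1
```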

I still get only !!! tokens in the output file when using the vLLM code provided in the model card.

IBM Granite org

@Jacqkues could you provide the input image you are using?

test.png
When I try it with the Gradio demo running on my VM it works, but not with vLLM.

I have fixed the issue by adding `--dtype float32` to the vLLM command line.
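
For anyone hitting the same thing, a minimal sketch of the launch command with that flag (the model ID is a placeholder, substitute the checkpoint you are serving; float32 avoids bfloat16, which the T4 does not support):

```bash
# Sketch: serve the model in full precision on a pre-Ampere GPU such as the T4.
# <your-model-id> is a placeholder, not taken from this thread.
vllm serve <your-model-id> --dtype float32
```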

IBM Granite org

@Jacqkues thanks for figuring this out. I added a Troubleshooting section to the README. Closing here.

auerchristoph changed discussion status to closed
