Model only outputs !!!!! when used with vLLM
#22
by Jacqkues - opened
I am trying to run the inference code with vLLM on a T4 GPU, but processing a single page takes a very long time and the output consists only of ! tokens.
Jacqkues changed discussion status to closed
Jacqkues changed discussion status to open
Hello @Jacqkues,
Could you please try the following versions:
vllm==0.10.1.1
transformers==4.55.2
torch==2.7.1
torchvision==0.22.1
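For reference, a minimal sketch of pinning those exact versions with pip (assuming a fresh or compatible Python environment):

```bash
# Pin the versions suggested above; quoting avoids shell issues with "==".
pip install "vllm==0.10.1.1" "transformers==4.55.2" "torch==2.7.1" "torchvision==0.22.1"
```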
I still get only !!! tokens in the output file when using the code provided in the model card for vLLM.
I have fixed the issue by adding `--dtype float32` to the vLLM command line.
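For anyone hitting the same symptom, a sketch of what that looks like with the `vllm serve` entrypoint; `<model-id>` is a placeholder for the model from this repository. A likely explanation is that the T4 (compute capability 7.5) has no native bfloat16 support, so the float16 fallback can overflow and yield degenerate ! output, which forcing float32 avoids:

```bash
# Force full-precision weights and activations instead of the fp16 fallback.
# <model-id> is a placeholder - substitute the model from this repository.
vllm serve <model-id> --dtype float32
```

If you are running the model-card Python script rather than the server, the offline `LLM(...)` constructor takes an equivalent `dtype="float32"` argument.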
@Jacqkues thanks for figuring this out, I have added a Troubleshooting section to the README. Closing here.
auerchristoph changed discussion status to closed