Not running on vLLM / transformers

#3
by abiteddie - opened

Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
Unknown quantization type, got awq_marlin - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'quark', 'fp_quant', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq', 'spqr', 'fp8', 'auto-round', 'mxfp4']. Hence, we will skip the quantization. To remove the warning, you can delete the quantization_config attribute in config.json
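The warning itself suggests a workaround: delete the quantization_config attribute from the model's config.json so transformers stops trying to interpret the unsupported awq_marlin type. A minimal sketch of that edit with Python's json module (the file path and the example config contents here are hypothetical, adjust to your local model directory):

```python
import json

# Hypothetical stand-in for the model's config.json; in practice you would
# open the real file in your local model directory instead.
example = {
    "model_type": "llama",
    "quantization_config": {"quant_method": "awq_marlin"},
}
with open("config.json", "w") as f:
    json.dump(example, f)

# Remove the quantization_config attribute, as the warning suggests.
with open("config.json") as f:
    config = json.load(f)
config.pop("quantization_config", None)

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```

Note that this only silences the warning and skips quantized loading; it does not make transformers understand the awq_marlin format.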

What is `awq_marlin`?
