Not running on vLLM / Transformers
Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
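For reference, here is roughly how I load the image processor (the model id below is a placeholder); as far as I understand, passing use_fast=True explicitly should silence this first warning:

```python
from transformers import AutoImageProcessor

# Hypothetical model id, standing in for the actual checkpoint.
# Explicitly opting in to the fast image processor is supposed to
# silence the use_fast warning above.
processor = AutoImageProcessor.from_pretrained(
    "my-org/my-awq-model",
    use_fast=True,
)
```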
Unknown quantization type, got awq_marlin - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'quark', 'fp_quant', 'eetq', 'higgs', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet', 'vptq', 'spqr', 'fp8', 'auto-round', 'mxfp4']. Hence, we will skip the quantization. To remove the warning, you can delete the quantization_config attribute in config.json
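If I read this second warning correctly, it only suggests deleting the quantization_config attribute from config.json, which would look roughly like the sketch below (the config path is a placeholder); I am not sure that is the right fix for actually loading the AWQ weights, hence the question.

```python
import json

# Hypothetical local path to the downloaded model snapshot.
config_path = "path/to/model/config.json"

with open(config_path) as f:
    config = json.load(f)

# Drop the quantization_config attribute, as the warning suggests.
config.pop("quantization_config", None)

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```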
What is awq_marlin?