Q/prob scaling factors error with vLLM
#4 opened by inversemod
WARNING 05-03 17:40:35 [kv_cache.py:128] Using Q scale 1.0 and prob scale 1.0 with fp8 attention. This may cause accuracy issues. Please make sure Q/prob scaling factors are available in the fp8 checkpoint.
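
One way to check whether the checkpoint actually ships these factors is to list the tensor names in each `.safetensors` shard and look for the scale entries. The sketch below reads only the safetensors header (an 8-byte little-endian length followed by a JSON table of tensors), so it does not load any weights. The suffixes `.q_scale` and `.prob_scale` are an assumption about how the scales would be named in the checkpoint, and `find_scale_tensors` is a hypothetical helper, not part of vLLM; adjust the suffixes to match your vLLM version and quantization tool.

```python
import json
import struct

# ASSUMPTION: names of the Q/prob scale tensors end with these suffixes.
# Verify against the vLLM version and quantizer you are using.
ASSUMED_SUFFIXES = (".q_scale", ".prob_scale")


def tensor_names(path):
    """Return the tensor names listed in a .safetensors file header."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]


def find_scale_tensors(path, suffixes=ASSUMED_SUFFIXES):
    """Hypothetical helper: list tensors whose names end with the given suffixes."""
    return sorted(n for n in tensor_names(path) if n.endswith(suffixes))
```

If `find_scale_tensors("model.safetensors")` returns an empty list for every shard, the checkpoint most likely does not contain Q/prob scales under those names, which would be consistent with the fallback to 1.0 reported in the warning.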