Qwen3-32B-128K-UD-Q6_K_XL will not work with draft models
#3 opened by smcleod
I'm not 100% sure whether it's the UD quantisation or the 128K change, but Qwen3-32B-128K-UD-Q6_K_XL.gguf will not work with smaller draft models (e.g. Qwen3-0.6B-Q6_K and Qwen3-0.6B-UD-Q4_K_XL):
load_model: the draft model '/models/Qwen3-0.6B-UD-Q4_K_XL.gguf' is not compatible with the target model '/models/Qwen3-32B-128K-UD-Q6_K_XL.gguf'
Note that the Qwen3-30B-A3B-128K-UD-Q4_K_XL.gguf model does work, which is what has me thinking the problem is specific to either the 32B model or the 32B-UD-128K variant.
I'm in the process of downloading the non-128k versions to test further.
DOES work with the Qwen3-0.6B-Q6_K.gguf and Qwen3-0.6B-UD-Q4_K_XL.gguf draft models:
- Qwen3-30B-A3B-128K-UD-Q4_K_XL.gguf
- Qwen3-32B-UD-Q4_K_XL.gguf
Does NOT work with the Qwen3-0.6B-Q6_K.gguf or Qwen3-0.6B-UD-Q4_K_XL.gguf draft models:
- Qwen3-32B-128K-UD-Q6_K_XL.gguf
- Qwen3-8B-128K-UD-Q6_K_XL.gguf
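
For anyone who wants to re-run this matrix, here is a rough Python sketch. It builds one command per target/draft pair and picks the incompatibility message out of a log line. The regex is taken from the error quoted above. The `llama-server` binary name, its `-m` / `-md` flags, and the `/models` directory are assumptions about the setup, not something stated in this post, so adjust them to match yours. The helper names `parse_incompatible` and `build_commands` are made up for illustration.

```python
import re
from itertools import product

# Message format copied from the log line quoted in the post.
INCOMPATIBLE = re.compile(
    r"the draft model '(?P<draft>[^']+)' is not compatible "
    r"with the target model '(?P<target>[^']+)'"
)

# Target and draft files listed in the post.
TARGETS = [
    "Qwen3-30B-A3B-128K-UD-Q4_K_XL.gguf",
    "Qwen3-32B-UD-Q4_K_XL.gguf",
    "Qwen3-32B-128K-UD-Q6_K_XL.gguf",
    "Qwen3-8B-128K-UD-Q6_K_XL.gguf",
]
DRAFTS = [
    "Qwen3-0.6B-Q6_K.gguf",
    "Qwen3-0.6B-UD-Q4_K_XL.gguf",
]


def parse_incompatible(line):
    """Return (draft, target) if the line reports an incompatible pair, else None."""
    match = INCOMPATIBLE.search(line)
    if match is None:
        return None
    return match.group("draft"), match.group("target")


def build_commands(targets, drafts, model_dir="/models", binary="llama-server"):
    """Build one argv list per target/draft pair.

    Assumption: the binary takes the target via -m and the draft via -md.
    """
    return [
        [binary, "-m", f"{model_dir}/{target}", "-md", f"{model_dir}/{draft}"]
        for target, draft in product(targets, drafts)
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so each can be checked by hand.
    for cmd in build_commands(TARGETS, DRAFTS):
        print(" ".join(cmd))
```

Running each printed command and feeding its log output through `parse_incompatible` should give the same works / does-not-work split as the lists above, if the behaviour is reproducible on your build.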