Qwen3-32B-128K-UD-Q6_K_XL will not work with draft models

#3
by smcleod - opened

I'm not 100% sure whether it's the UD quantisation or the 128K change, but Qwen3-32B-128K-UD-Q6_K_XL.gguf will not work with smaller draft models (e.g. Qwen3-0.6B-Q6_K and Qwen3-0.6B-UD-Q4_K_XL):

load_model: the draft model '/models/Qwen3-0.6B-UD-Q4_K_XL.gguf' is not compatible with the target model '/models/Qwen3-32B-128K-UD-Q6_K_XL.gguf'

Note that the Qwen3-30B-A3B-128K-UD-Q4_K_XL.gguf model does work, which is what got me thinking the problem is specific either to the 32B model or to the 32B-UD-128K variant.

I'm in the process of downloading the non-128k versions to test further.

DOES work with Qwen3-0.6B-Q6_K.gguf and Qwen3-0.6B-UD-Q4_K_XL.gguf as draft models:

  • Qwen3-30B-A3B-128K-UD-Q4_K_XL.gguf
  • Qwen3-32B-UD-Q4_K_XL.gguf

Does NOT work with Qwen3-0.6B-Q6_K.gguf or Qwen3-0.6B-UD-Q4_K_XL.gguf as draft models:

  • Qwen3-32B-128K-UD-Q6_K_XL.gguf
  • Qwen3-8B-128K-UD-Q6_K_XL.gguf
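
The load_model message does not say which property of the two models differs. One way to narrow it down is to compare the tokenizer metadata of a working target and a failing target against the draft. The sketch below assumes the mismatch lies in tokenizer metadata, which is a guess and not something confirmed in this thread. The key names follow the GGUF tokenizer metadata convention as I understand it; the helper names and all values are hypothetical placeholders, and the dicts stand in for metadata you would read from the actual files.

```python
# Hypothetical helper: diff tokenizer-related metadata of a target and a draft model.
# The dicts below are stand-ins for metadata read from each GGUF file;
# every value is a made-up placeholder, not taken from any real Qwen3 file.

TOKENIZER_KEYS = [
    "tokenizer.ggml.model",
    "tokenizer.ggml.bos_token_id",
    "tokenizer.ggml.eos_token_id",
    "tokenizer.ggml.padding_token_id",
    "tokenizer.ggml.add_bos_token",
]


def diff_tokenizer_metadata(target, draft, keys=TOKENIZER_KEYS):
    """Return {key: (target_value, draft_value)} for each key whose values differ."""
    return {
        key: (target.get(key), draft.get(key))
        for key in keys
        if target.get(key) != draft.get(key)
    }


def vocab_size_gap(target, draft):
    """Absolute difference in the number of vocabulary entries."""
    target_tokens = target.get("tokenizer.ggml.tokens", [])
    draft_tokens = draft.get("tokenizer.ggml.tokens", [])
    return abs(len(target_tokens) - len(draft_tokens))


# Placeholder metadata for illustration only.
example_target = {
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.bos_token_id": 1,
    "tokenizer.ggml.eos_token_id": 2,
    "tokenizer.ggml.tokens": ["a", "b", "c"],
}
example_draft = {
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.bos_token_id": 9,
    "tokenizer.ggml.eos_token_id": 2,
    "tokenizer.ggml.tokens": ["a", "b"],
}

print(diff_tokenizer_metadata(example_target, example_draft))
print(vocab_size_gap(example_target, example_draft))
```

Running the same comparison for one target from the DOES work list and one from the Does NOT work list against the same draft should show whether any of these fields separates the two groups.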
