Tokenizer or template bug
#1, opened by beijinghouse
Q4_K_XL quant of MedGemma-27b-text-it-GGUF in llama.cpp
initial response begins:
<unused94>thought
Here's a breakdown of the thinking process...
Hi there, I tried it in llama.cpp and the error doesn't occur. Do you know if it happens specifically with the Q4 XL quant?
Yes, 100%. It's the Unsloth Q4 XL quant.
llama.cpp b5423. I didn't modify the sampler settings. It occurred on my very first attempt to use the model, so I assumed it would be easy to reproduce. The prompt was something like "describe all medications that can be used to treat X".
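For anyone trying to reproduce this across quants, here is a minimal sketch of a check for the symptom. It assumes the problem shows up as a literal `<unusedN>` string at the very start of the generated text, as in the report above. `leaked_token_id` is a hypothetical helper written for this thread, not part of llama.cpp or any tokenizer library.

```python
import re

# Matches a reserved-token marker such as "<unused94>" at the start of the
# text, optionally followed by the word "thought" (the pattern reported above).
LEAKED_PREFIX = re.compile(r"^\s*<unused(\d+)>(thought)?")


def leaked_token_id(response: str):
    """Return the numeric id of a leading <unusedN> marker, or None if absent."""
    match = LEAKED_PREFIX.match(response)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Sample text copied from the report; the second string is a made-up
    # example of a clean response.
    reported = "<unused94>thought\nHere's a breakdown of the thinking process..."
    clean = "Several medications can be used to treat X."
    print(leaked_token_id(reported))
    print(leaked_token_id(clean))
```

Running each quant's first response through a check like this would make it easier to tell whether the leak is specific to Q4_K_XL or depends on the build or chat template.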