Tokenizer or template bug

#1
by beijinghouse - opened

Q4_K_XL quant of MedGemma-27-text-it-GGUF in llama.cpp.

The initial response begins:

<unused94>thought
Here's a breakdown of the thinking process...

Unsloth AI org

Hi there, I tried it in llama.cpp and the error doesn't occur. Do you know if it's specific to the Q4 XL quant?

Yes, 100% the Unsloth Q4 XL quant.

llama.cpp b5423. I didn't modify the sampler settings. It occurred on the very first attempt to use the model, so I assumed it would be easy to reproduce. The prompt was something like "describe all medications that can be used to treat X".
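
For anyone trying to reproduce this, here is a minimal sketch of how saved model output could be checked for the symptom described above. It only looks for markers of the form `<unusedN>`, like the `<unused94>` at the start of the response in this report. The function names are hypothetical and not part of llama.cpp or any Unsloth tooling, and the sample string is just the snippet quoted in this thread.

```python
import re

# Matches placeholder-style markers of the form <unusedN>, e.g. <unused94>.
UNUSED_TOKEN = re.compile(r"<unused(\d+)>")


def leaked_unused_tokens(text: str) -> list[str]:
    """Return every <unusedN> marker found in the text, in order of appearance."""
    return [m.group(0) for m in UNUSED_TOKEN.finditer(text)]


def starts_with_unused_token(text: str) -> bool:
    """True if the text begins with an <unusedN> marker, ignoring leading whitespace."""
    return UNUSED_TOKEN.match(text.lstrip()) is not None


# Sample taken from the response quoted in this thread.
sample = "<unused94>thought\nHere's a breakdown of the thinking process..."

print(leaked_unused_tokens(sample))
print(starts_with_unused_token(sample))
```

Running this over the raw output of a few quants with the same prompt would show whether the marker appears only with Q4_K_XL or with other quants too.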
