Great job

#1 opened by ykarout

Thanks for this! It was in my pipeline and you saved me :)
Quick question: I used the gguf-my-repo space hosted on HF to convert a test Q8_0 quant, but it did not actually inject the chat template into the GGUF file. When I tested it in LM Studio, which usually extracts the chat template automatically, the template was not found, and I had to add it manually from the original Jinja file.
How did you manage to inject the chat template into all the quants? I know recent versions of llama.cpp handle this automatically during conversion, but I wonder why it was missing from the quant I created with the gguf-my-repo space: https://huggingface.co/spaces/ggml-org/gguf-my-repo
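
In case anyone wants to verify their own quants: here is a minimal check, assuming the gguf-py package that ships with llama.cpp (pip install gguf); the filename is just a placeholder.

```python
# Minimal sketch: read the GGUF metadata and print the embedded chat
# template if one exists. Assumes the gguf-py package from llama.cpp.
from gguf import GGUFReader

reader = GGUFReader("model-Q8_0.gguf")  # placeholder filename
field = reader.fields.get("tokenizer.chat_template")
if field is None:
    print("no chat template embedded in this GGUF")
else:
    # String metadata is stored as raw bytes; decode the value part.
    print(bytes(field.parts[field.data[0]]).decode("utf-8"))
```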

I have no idea. We don't have any special magic to add a chat template; we also just use llama.cpp, i.e. convert_hf_to_gguf.py.
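
For context, the template ends up in the file as plain GGUF metadata under the tokenizer.chat_template key; I believe convert_hf_to_gguf.py writes it via gguf-py's SpecialVocab handling. A toy sketch of that write path, assuming gguf-py's writer API (the arch, tensor, and template below are placeholders, not what the converter actually writes):

```python
# Toy sketch: embed a chat template as GGUF metadata with gguf-py.
import numpy as np
from gguf import GGUFWriter

writer = GGUFWriter("toy.gguf", arch="llama")  # placeholder output and arch
writer.add_chat_template("{% for m in messages %}{{ m.content }}{% endfor %}")
writer.add_tensor("dummy.weight", np.zeros((2, 2), dtype=np.float32))
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```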

Oh, seems like a bug in the gguf-my-repo space then… maybe an outdated llama.cpp version.
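
Until the space is fixed, re-converting with a current llama.cpp (or patching the quant with its gguf_new_metadata.py script, if I remember the name right) should work. For the manual route, here is a minimal sketch for recovering the template from the original repo, assuming it keeps the template in tokenizer_config.json (some newer repos ship a standalone .jinja file instead):

```python
# Minimal sketch: pull the chat template out of tokenizer_config.json.
import json

with open("tokenizer_config.json", encoding="utf-8") as f:
    template = json.load(f).get("chat_template")

if template is None:
    print("no chat_template key here; look for a standalone .jinja file")
else:
    print(template)  # paste into LM Studio or re-embed at conversion time
```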
