It doesn't work with llama.cpp
With llama.cpp it seems to respond only to its system prompt; it never receives the user's prompt.
Hey there, can you try https://huggingface.co/bartowski/THU-KEG_LongWriter-Zero-32B-GGUF?
@bys0318 I tried bartowski's IQ4_XS quant of this model in LM Studio and asked it to write a story. It produces its initial reasoning outside of a thinking block, then opens a thinking block containing further reasoning, and only then writes the story. This is rather odd, because I would expect all of its thoughts to go inside the thinking block. Also, I needed to add a stop string, or else it would not stop generating. Is there something wrong with the chat template, or is it llama.cpp / LM Studio's implementation of the chat template?
Hi! We haven't tested usage with llama.cpp 😢, so it's possible there is indeed some misalignment. You might want to refer to the usage information on our model card page, particularly the def format_prompt_with_template(prompt) function and the stop_strings, especially "</answer>". Aligning with these may help resolve the issue.
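Roughly, the idea looks like the sketch below (illustrative only: the repo id and the placeholder chat-template wrapping inside format_prompt_with_template are assumptions here, so please follow the model card's own definition of that function; the "</answer>" stop string is the one mentioned above):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for the original (non-GGUF) weights.
MODEL = "THU-KEG/LongWriter-Zero-32B"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype="auto", device_map="auto")

def format_prompt_with_template(prompt: str) -> str:
    # Placeholder wrapping: the authoritative template lives on the model card.
    # The key point is that the user prompt must be wrapped in the expected chat
    # markers before generation; the model then emits its reasoning followed by
    # an answer block closed with </answer>.
    messages = [{"role": "user", "content": prompt}]
    return tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

text = format_prompt_with_template("Write a 2000-word story about a lighthouse keeper.")
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# stop_strings requires passing the tokenizer to generate(); stopping on
# "</answer>" keeps the model from running on after the answer block closes.
output = model.generate(
    **inputs,
    max_new_tokens=8192,
    tokenizer=tokenizer,
    stop_strings=["</answer>"],
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))
```

In LM Studio or llama.cpp, the equivalent is to make sure the chat template applied at inference matches this formatting and to register "</answer>" as a stop string.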