How to disable thinking?

#1
by kurnevsky - opened

The original model card mentions an enable_thinking parameter passed to the tokenizer. Any idea what it does and how to emulate it with llama-cpp?

It seems it's encoded in the chat template, so passing a custom chat template should work. However, llama-cpp also fails to parse the original template with the --jinja arg, so it would need some simplification.
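To illustrate what "encoded in the chat template" means, here is a minimal Jinja2 sketch of the mechanism (the template below is a simplified stand-in, not the model's actual template): when enable_thinking is false, the template pre-fills an empty think block in the assistant turn, so the model skips reasoning.

```python
from jinja2 import Template

# Simplified, hypothetical chat template showing how an
# enable_thinking flag can be honored at render time.
template = Template(
    "{% for m in messages %}"
    "<|im_start|>{{ m['role'] }}\n{{ m['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "<|im_start|>assistant\n"
    # With thinking disabled, emit an already-closed empty think block.
    "{% if not enable_thinking %}<think>\n\n</think>\n\n{% endif %}"
)

messages = [{"role": "user", "content": "Hello"}]
with_thinking = template.render(messages=messages, enable_thinking=True)
no_thinking = template.render(messages=messages, enable_thinking=False)
print(no_thinking)
```

Rendering with enable_thinking=False appends the empty `<think>\n\n</think>` block after the assistant header, while enable_thinking=True leaves the turn open for the model to produce its own reasoning.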

/no_think

Yes, I've found that adding /no_think at the end of my prompt works perfectly.
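If you want to apply this automatically rather than typing it each time, a small helper can append the soft switch to the last user message before the messages are handed to llama-cpp (the helper name and the exact placement are my own choices; the /no_think token itself comes from the model card):

```python
def disable_thinking(messages):
    """Return a copy of the chat messages with /no_think appended
    to the last user message, so the model skips its thinking phase."""
    out = [dict(m) for m in messages]  # shallow copy, don't mutate input
    for m in reversed(out):
        if m.get("role") == "user":
            m["content"] = m["content"].rstrip() + " /no_think"
            break
    return out

msgs = [{"role": "user", "content": "Summarize this article."}]
print(disable_thinking(msgs))
# The result can then be passed to your usual chat-completion call.
```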

kurnevsky changed discussion status to closed
