How to disable thinking?
#1 · opened by kurnevsky
The original model card mentions an enable_thinking parameter passed to the tokenizer. Any idea what it does and how to emulate it with llama-cpp?
It seems it's encoded in the chat template, so passing a custom chat template should work, but llama-cpp also fails to parse the template with the --jinja arg, so it would need some simplification.
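For reference, this is roughly how the parameter is used on the Hugging Face side: enable_thinking is consumed by the Jinja chat template rather than by the tokenizer itself. A minimal sketch, assuming a Qwen3-style checkpoint (the repo name here is illustrative):

```python
from transformers import AutoTokenizer

# Illustrative checkpoint name; substitute the actual model repo.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

messages = [{"role": "user", "content": "What is 2 + 2?"}]

# Extra kwargs such as enable_thinking are forwarded to the Jinja
# chat template; with enable_thinking=False the Qwen3 template emits
# an empty <think></think> block so the model skips its reasoning.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
print(text)
```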
/no_think
Yes, I've found that adding /no_think at the end of my prompt works perfectly.
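For example, via the OpenAI-compatible endpoint of a locally running llama-server (a minimal sketch, assuming the server is listening on port 8080 with this model loaded):

```python
import requests

# Assumes llama-server was started locally, e.g.:
#   llama-server -m model.gguf --port 8080
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            # Appending /no_think is a per-turn soft switch: the model
            # answers this message without producing a thinking block.
            {"role": "user", "content": "What is 2 + 2? /no_think"}
        ],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```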
Here is the proper way to do this: https://github.com/ggml-org/llama.cpp/issues/13178#issuecomment-2839416968
kurnevsky changed discussion status to closed