How to disable the <think> tag in a streaming request?

#4
by celsowm - opened

Hi!
I tried setting "enable_thinking: false" in the request body sent to a llama-server, but it has no effect:

const requestBody = {
    messages: [
        {
            role: 'user',
            content: 'who are you?',
        },
    ],
    temperature: 1.0,
    top_p: 1.0,
    model: 'unsloth/Qwen3-4B-GGUF',
    enable_thinking: false,
    stream: true,
};

Even with the "/no_think" soft switch in the prompt, the response still contains an empty <think></think> tag.
Any hints?

This is as expected. Dealing with CoT and empty thinking is the client's responsibility.
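
For anyone handling this on the client, here is a minimal sketch of stripping the block from streamed text. It assumes the tags arrive as plain text in the streamed content and may be split across chunks; `createThinkStripper` is a hypothetical helper name, not part of llama-server or any library.

```javascript
// Hypothetical helper: removes <think>...</think> blocks from streamed text.
// Assumption: the tags appear literally in the text and may span chunk boundaries.
function createThinkStripper() {
  const OPEN = '<think>';
  const CLOSE = '</think>';
  let buffer = '';      // text not yet emitted (may end with a partial tag)
  let inside = false;   // true while between <think> and </think>
  let trimNext = false; // drop whitespace that directly follows </think>

  // Length of the longest suffix of `text` that is a proper prefix of `tag`.
  function partialTagLength(text, tag) {
    const max = Math.min(text.length, tag.length - 1);
    for (let k = max; k > 0; k--) {
      if (text.endsWith(tag.slice(0, k))) return k;
    }
    return 0;
  }

  // Applies the pending whitespace trim to text that is about to be emitted.
  function emit(text) {
    if (trimNext) {
      text = text.replace(/^\s+/, '');
      if (text.length > 0) trimNext = false;
    }
    return text;
  }

  return {
    // Feed one streamed chunk; returns the visible text that is safe to show.
    push(chunk) {
      buffer += chunk;
      let out = '';
      for (;;) {
        const tag = inside ? CLOSE : OPEN;
        const idx = buffer.indexOf(tag);
        if (idx === -1) {
          // No full tag yet: hold back a possible partial tag at the end.
          const keep = partialTagLength(buffer, tag);
          if (!inside) out += emit(buffer.slice(0, buffer.length - keep));
          buffer = buffer.slice(buffer.length - keep);
          return out;
        }
        if (!inside) out += emit(buffer.slice(0, idx));
        buffer = buffer.slice(idx + tag.length);
        inside = !inside;
        if (!inside) trimNext = true; // just consumed </think>
      }
    },
    // Call once the stream ends; returns any held-back visible text.
    flush() {
      const rest = inside ? '' : emit(buffer);
      buffer = '';
      return rest;
    },
  };
}

// Usage with chunks that split the tags across boundaries.
const stripper = createThinkStripper();
const chunks = ['<thi', 'nk>\n\n</th', 'ink>\n\nI am ', 'Qwen.'];
let visible = '';
for (const chunk of chunks) visible += stripper.push(chunk);
visible += stripper.flush();
console.log(visible);
```

The stripper holds back any trailing text that could be the start of a tag, so a tag split across two chunks is never shown half-rendered. In a real client you would call `push` on each streamed text fragment and `flush` when the stream ends.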

celsowm changed discussion status to closed
