About the non-thinking mode

#14

by volcanos - opened Apr 29

Apr 29

Great work!

I've been working on something similar recently: training a single model to support both thinking and non-thinking modes. But when I mixed in a certain proportion of <think>\n\n<think> data, model performance dropped significantly.
Is Qwen's approach to insert <think>\n\n<think> directly after the user question at inference time? Is SFT training alone enough, or is the final RL stage necessary?

volcanos

Apr 29

I'd also like to know how to make the model follow the thinking budget. I can't find the method in the blog.

Mar2ck

Apr 29

edited Apr 30

Shouldn't it be <think>\n\n</think>?

Edit: The full assistant start string should be <|im_start|>assistant\n<think>\n\n</think>\n\n, as per Qwen3 GitHub Issue #1286.
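
For anyone building the prompt by hand, here is a minimal sketch of what that start string looks like in practice. It assumes the usual ChatML layout for a single user turn and assumes that nothing is prefilled in thinking mode; `build_prompt` is a hypothetical helper written for illustration, not part of any Qwen library.

```python
# Hypothetical helper for illustration only; not an official Qwen API.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"

# Empty think block that marks the turn as non-thinking (string from the edit above).
EMPTY_THINK = "<think>\n\n</think>\n\n"


def build_prompt(user_message: str, thinking: bool = True) -> str:
    """Build a single-turn ChatML prompt ending at the assistant start string."""
    prompt = (
        f"{IM_START}user\n{user_message}{IM_END}\n"
        f"{IM_START}assistant\n"
    )
    if not thinking:
        # Prefill a closed, empty think block so the model answers directly.
        prompt += EMPTY_THINK
    # Assumption: in thinking mode the model opens <think> on its own.
    return prompt


if __name__ == "__main__":
    # repr() makes the newlines visible.
    print(repr(build_prompt("What is 2 + 2?", thinking=False)))
```

Note the closing tag is </think>, not a second <think>, and it is followed by two newlines before the answer starts.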

volcanos changed discussion status to closed 22 days ago
