no think tokens generated by model (only chat template)
How can I parse the reasoning when the model itself generates no delimiter token?
The model's generated output (the content field value) starts directly with the reasoning ("Okay, so the problem is asking...") without emitting a `<think>` tag itself. The `<think>` tag is only part of the prompt constructed by the chat template (conversation.py / tokenizer_config.json), not part of the text the model generates.
Critically, the model does not generate a `</think>` tag, which means there is no delimiter token that can be used to identify where the thinking finishes and the actual response begins.
The result is that the thinking text and the final response are merged together with no separator. Do you have a suggestion on how to solve this? (or maybe you are working on R1V3 using Qwen3-32B? :-] )
Hi, thanks for raising this! Actually, the model does generate the `</think>` tag in its prediction to indicate the end of the reasoning process. The `<think>` tag is indeed part of the chat template defined in tokenizer_config.json or conversation.py, so it's prepended during prompt construction. However, the `</think>` is expected to be generated by the model itself, serving as a clear delimiter between the internal reasoning and the final answer.
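For example, here is a minimal post-processing sketch (function name and variables are just illustrative) that splits the decoded output on the generated `</think>` delimiter, with a fallback for the case where generation was truncated before the delimiter appeared:

```python
def split_reasoning(generated_text: str, delimiter: str = "</think>"):
    """Split a decoded generation into (reasoning, answer).

    Assumes the prompt already ends with "<think>" (added by the chat
    template) and the model emits "</think>" before the final answer.
    If the delimiter never appears (e.g. truncated generation), the whole
    output is treated as reasoning with an empty answer.
    """
    if delimiter in generated_text:
        reasoning, answer = generated_text.split(delimiter, 1)
        return reasoning.strip(), answer.strip()
    return generated_text.strip(), ""

# Hypothetical usage with a decoded output string:
# reasoning, answer = split_reasoning(output_text)
```

If you are not seeing `</think>` at all in the output, it may help to check that the generation isn't being cut off by a low max-new-tokens limit, or that the stop criteria don't terminate generation before the delimiter is produced.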