How to control thinking length?

#24
by lidh15 - opened

I really want to use your fantastic thinking mode, but the thought process is a bit long.
How can we limit thinking to a maximum token count, for example, at most 256 tokens for the thinking part?

I think you can only implement this by generating twice: first with a limited output length and `</think>` set as a stop token, then append `</think>` to the truncated output and run inference again.
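
A minimal sketch of that two-pass approach with the `transformers` library, assuming a Qwen3-style model whose chat template wraps reasoning in `<think>...</think>`. The model name, the 256-token budget, and the `enable_thinking` template kwarg are illustrative assumptions, not something confirmed in this thread:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # assumption: any chat model with <think> tags
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "What is 17 * 23?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=True,  # assumption: template flag that opens a <think> block
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Pass 1: cap the thinking budget at 256 new tokens.
think_budget = 256
gen = model.generate(**inputs, max_new_tokens=think_budget)
new_text = tokenizer.decode(
    gen[0][inputs.input_ids.shape[1]:], skip_special_tokens=False
)

# If the model did not close its own think block within budget,
# force-close it so the second pass produces the final answer.
if "</think>" not in new_text:
    close_ids = tokenizer(
        "\n</think>\n\n", return_tensors="pt"
    ).input_ids.to(model.device)
    gen = torch.cat([gen, close_ids], dim=-1)

# Pass 2: continue from the (possibly truncated) thoughts.
final = model.generate(
    input_ids=gen, attention_mask=torch.ones_like(gen), max_new_tokens=512
)
print(tokenizer.decode(final[0][gen.shape[1]:], skip_special_tokens=True))
```

One caveat with this trick: when the budget cuts the thoughts off mid-sentence, the force-closed reasoning is abruptly truncated, so answer quality may drop compared to letting the model finish thinking on its own.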

I noticed that the length of the "think" output is not the same if you run the model multiple times with the same input and the same configuration; sometimes there are big variations. Is there a way to at least tell the model to think more, so that the thought process is longer?
