littlebird13 committed
Commit a0809d1 · verified · 1 Parent(s): e8bbd82

Update README.md

Files changed (1): README.md +3 −4

README.md CHANGED
@@ -101,13 +101,11 @@ For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.4` or to create
 
 For local use, applications such as llama.cpp, Ollama, LMStudio, and MLX-LM have also supported Qwen3.
 
-
-
 ## Switching Between Thinking and Non-Thinking Mode
 
 > [!TIP]
-> The `enable_thinking` switch is also available in APIs created by vLLM and SGLang.
-> Please refer to our documentation for [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) and [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) users.
+> The `enable_thinking` switch is also available in APIs created by SGLang and vLLM.
+> Please refer to our documentation for [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) and [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) users.
 
 ### `enable_thinking=True`
 
@@ -127,6 +125,7 @@ In this mode, the model will generate think content wrapped in a `<think>...</th
 > [!NOTE]
 > For thinking mode, use `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0` (the default setting in `generation_config.json`). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the [Best Practices](#best-practices) section.
 
+
 ### `enable_thinking=False`
 
 We provide a hard switch to strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models. This mode is particularly useful in scenarios where disabling thinking is essential for enhancing efficiency.
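For context on the tip reordered above, here is a minimal sketch of how the `enable_thinking` switch and the thinking-mode sampling settings quoted in the NOTE fit together. It assumes the `transformers` chat-template path referenced in the README; the tokenizer object and message content are illustrative placeholders, not part of the diff.

```python
# Sketch (assumptions noted in the lead-in): the thinking-mode sampling
# defaults quoted from generation_config.json, plus a helper showing where
# the enable_thinking keyword is passed to the Qwen3 chat template.

# Sampling settings for thinking mode, per the NOTE in the README diff.
# The NOTE warns against greedy decoding, hence a nonzero temperature.
THINKING_MODE_SAMPLING = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "min_p": 0.0,
}

def build_prompt(tokenizer, messages, thinking: bool) -> str:
    """Render a chat prompt with thinking mode on or off.

    enable_thinking=True  -> the model emits <think>...</think> content.
    enable_thinking=False -> hard switch; behavior aligns with
                             Qwen2.5-Instruct (no thinking block).
    """
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=thinking,
    )
```

The dict mirrors the `generation_config.json` defaults named in the NOTE; whether you pass them to `model.generate` or to an SGLang/vLLM endpoint, the same values apply in thinking mode.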