shimmyshimmer committed on
Commit
0812877
·
verified ·
1 Parent(s): 8c1d837

Update README.md

  1. README.md +6 -15
README.md CHANGED
@@ -32,25 +32,16 @@ library_name: transformers
32
  <h1 style="margin-top:0rem; margin-bottom: 0rem;">🐋 DeepSeek-R1-0528-Qwen3-8B Usage Guidelines</h1>
33
  </div>
34
 
35
- | Setting | Non-Thinking Mode | Thinking Mode |
36
- |---------------|-------------------|----------------|
37
- | Temperature | 0.7 | 0.6 |
38
- | Min_P | 0.0 | 0.0 |
39
- | Top_P | 0.8 | 0.95 |
40
- | TopK | 20 | 20 |
41
-
42
- <h4 style="margin-top:0rem;">Chat template/prompt format:</h4>
43
-
44
  ```
45
- <|im_start|>user\nWhat is 2+2?<|im_end|>\n<|im_start|>assistant\n
46
  ```
47
- - For NON thinking mode, we purposely enclose <think> and </think> with nothing:
48
-
49
  ```
50
- <|im_start|>user\nWhat is 2+2?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n
51
  ```
52
- - For Thinking-mode, DO NOT use greedy decoding, as it can lead to performance degradation and endless repetitions.
53
-
54
  - For complete detailed instructions, see our guide: [unsloth.ai/blog/deepseek-r1-0528](https://docs.unsloth.ai/basics/deepseek-r1-0528-how-to-run-locally)
55
 
56
  ---
 
32
  <h1 style="margin-top:0rem; margin-bottom: 0rem;">🐋 DeepSeek-R1-0528-Qwen3-8B Usage Guidelines</h1>
33
  </div>
34
 
35
+ - Set the temperature between **0.5–0.7 (0.6 recommended)** to reduce repetition and incoherence.
36
+ - Set Top_P to **0.95 (recommended)**.
37
+ - R1-0528 uses the same chat template as the original R1 model:
 
 
 
 
 
 
38
  ```
39
+ <|begin▁of▁sentence|><|User|>What is 1+1?<|Assistant|>It's 2.<|end▁of▁sentence|><|User|>Explain more!<|Assistant|>
40
  ```
41
+ - For llama.cpp / GGUF inference, skip the BOS token, since llama.cpp adds it automatically:
 
42
  ```
43
+ <|User|>What is 1+1?<|Assistant|>
44
  ```
 
 
45
  - For complete detailed instructions, see our guide: [unsloth.ai/blog/deepseek-r1-0528](https://docs.unsloth.ai/basics/deepseek-r1-0528-how-to-run-locally)
46
 
47
  ---
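The updated guidelines can be sketched in Python as follows. This is a minimal illustration, not code from the README: the helper `build_prompt`, its `add_bos` flag, and the `SAMPLING` dictionary are hypothetical names. It assumes the template is exactly the flat format shown in the added lines, where each assistant turn is closed with the end-of-sentence token, and it does not handle system messages.

```python
# Special tokens copied verbatim from the template in the README diff.
BOS = "<|begin▁of▁sentence|>"
EOS = "<|end▁of▁sentence|>"

# Sampling settings recommended in the updated README (hypothetical container).
SAMPLING = {"temperature": 0.6, "top_p": 0.95}


def build_prompt(messages, add_bos=True):
    """Join user/assistant turns into one R1-style prompt string.

    Pass add_bos=False for llama.cpp / GGUF inference, where the
    README says the BOS token is added automatically.
    """
    parts = [BOS] if add_bos else []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            parts.append("<|User|>" + content)
        elif role == "assistant":
            # Completed assistant turns end with the EOS token.
            parts.append("<|Assistant|>" + content + EOS)
        else:
            raise ValueError(f"unsupported role: {role}")
    # A trailing assistant tag cues the model to generate the next reply.
    parts.append("<|Assistant|>")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "What is 1+1?"},
    {"role": "assistant", "content": "It's 2."},
    {"role": "user", "content": "Explain more!"},
]
print(build_prompt(conversation))
print(build_prompt(conversation[:1], add_bos=False))
```

The first call reproduces the multi-turn example from the diff, and the second reproduces the llama.cpp variant without the BOS token. In practice the tokenizer's own chat template is the safer source of truth; the helper only makes the string layout explicit.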