bwshen-mi committed on
Commit c72df45 · verified · 1 Parent(s): 95b45ad

Update README.md

Files changed (1)
  1. README.md +45 -6
README.md CHANGED
@@ -11,11 +11,11 @@ library_name: transformers
 
 <h3 align="center">
 <b>
- <span>━━━━━━━━━━━━━━━━━━━━━━━━━</span>
 <br/>
 Unlocking the Reasoning Potential of Language Model<br/>From Pretraining to Posttraining
 <br/>
- <span>━━━━━━━━━━━━━━━━━━━━━━━━━</span>
 <br/>
 </b>
 </h3>
@@ -35,7 +35,41 @@ library_name: transformers
 
 <br/>
 
- > This model repository is licensed under the MIT License.
 
 ## I. Introduction
 
@@ -122,7 +156,7 @@ MiMo-7B series
 
 ### SGLang Inference
 
- Thanks to the [contribution](https://github.com/sgl-project/sglang/pull/5921) from the SGLang team, we supported MiMo in SGLang mainstream within 24h with MTP coming soon.
 
 Example Script
 
@@ -132,9 +166,14 @@ python3 -m uv pip install "sglang[all] @ git+https://github.com/sgl-project/sgla
 
 # Launch SGLang Server
 python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-Base --host 0.0.0.0 --trust-remote-code
 ```
 
- Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html). MTP will also be supported in 24h.
 
 ### vLLM inference
 
@@ -223,7 +262,7 @@ print(tokenizer.decode(output.tolist()[0]))
 ```bibtex
 @misc{coreteam2025mimounlockingreasoningpotential,
 title={MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining},
- author={{Xiaomi LLM-Core Team}},
 year={2025},
 eprint={2505.07608},
 archivePrefix={arXiv},
 
 
 <h3 align="center">
 <b>
+ <span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
 <br/>
 Unlocking the Reasoning Potential of Language Model<br/>From Pretraining to Posttraining
 <br/>
+ <span>━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━</span>
 <br/>
 </b>
 </h3>
 
 
 <br/>
 
+ ---
+
+ ## Updates
+
+ [2025.05.30] We scaled the SFT dataset from approximately 500K to 6M instances and continuously expanded the RL training window size from 32K to 48K. With these changes, the performance of [MiMo-7B-RL-0530](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL-0530) on AIME24 improved steadily and eventually surpassed that of DeepSeek R1 (79.8).
+
+ <table>
+ <thead>
+ <tr>
+ <th>Benchmark</th>
+ <th>MiMo-7B-RL</th>
+ <th>MiMo-7B-RL-0530</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td colspan="3"><strong>Mathematics</strong></td>
+ <td rowspan="11"><img width="80%" src="https://github.com/XiaomiMiMo/MiMo/raw/main/figures/length.jpg?raw=true"></td>
+ </tr>
+ <tr><td>MATH500<br/>(Pass@1)</td><td>95.8</td><td>97.2</td></tr>
+ <tr><td>AIME 2024<br/>(Pass@1)</td><td>68.2</td><td>80.1</td></tr>
+ <tr><td>AIME 2025<br/>(Pass@1)</td><td>55.4</td><td>70.2</td></tr>
+ <tr><td colspan="3"><strong>Code</strong></td></tr>
+ <tr><td>LiveCodeBench v5<br/>(Pass@1)</td><td>57.8</td><td>60.9</td></tr>
+ <tr><td>LiveCodeBench v6<br/>(Pass@1)</td><td>49.3</td><td>52.2</td></tr>
+ <tr><td colspan="3"><strong>STEM</strong></td></tr>
+ <tr><td>GPQA-Diamond<br/>(Pass@1)</td><td>54.4</td><td>60.6</td></tr>
+ <tr><td colspan="3"><strong>General</strong></td></tr>
+ <tr><td>Alignbench1.1<br/>(Evaluated by GPT4.1)</td><td>6.9</td><td>7.4</td></tr>
+ </tbody>
+ </table>
+
+ ---
 
 ## I. Introduction
 
 
 
 ### SGLang Inference
 
+ Thanks to the [MiMo model support](https://github.com/sgl-project/sglang/pull/5921) and [MTP](https://github.com/sgl-project/sglang/pull/6059) contributions from the SGLang team, MiMo is now supported in mainline SGLang.
 
 Example Script
 
 
 
 # Launch SGLang Server
 python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-Base --host 0.0.0.0 --trust-remote-code
+
+ # Launch MTP Server
+ python3 -m sglang.launch_server --model-path XiaomiMiMo/MiMo-7B-Base --trust-remote-code \
+ --speculative-algorithm EAGLE --speculative-num-steps 1 --speculative-eagle-topk 1 \
+ --speculative-num-draft-tokens 2 --mem-fraction 0.5
 ```
 
+ Detailed usage can be found in [SGLang documents](https://docs.sglang.ai/backend/send_request.html).
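
As a quick smoke test after either server is up, the sketch below sends one completion request to the launched endpoint. It assumes SGLang's default port 30000 and the native `/generate` route described in the linked documents; the prompt and sampling parameters are illustrative only, so adjust them to your setup.

```bash
# Minimal request against the SGLang server launched above
# (assumes the default port 30000; change host/port if you launched differently).
curl -s http://localhost:30000/generate \
  -H "Content-Type: application/json" \
  -d '{
        "text": "The capital of France is",
        "sampling_params": {"max_new_tokens": 32, "temperature": 0.6}
      }'
```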
 
 ### vLLM inference
 
 
 ```bibtex
 @misc{coreteam2025mimounlockingreasoningpotential,
 title={MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining},
+ author={LLM-Core-Team Xiaomi},
 year={2025},
 eprint={2505.07608},
 archivePrefix={arXiv},