zsytony committed
Commit 52e5bf8 · 1 Parent(s): 4160138

Update README.md

Files changed (1): README.md (+282, -0)

---
license: apache-2.0
---

<div align="center">

# MixtralKit

A Toolkit for the Mixtral Model

<br />
<br />

English | [简体中文](README_zh-CN.md)

See [GitHub](https://github.com/open-compass/MixtralKit) for inference and evaluation code.

</div>

> You are welcome to try [OpenCompass](https://github.com/open-compass/opencompass) for model evaluation; Mixtral performance numbers will be updated soon.

> This repo is an experimental implementation of inference code and is **not officially released** by Mistral AI.

- [Performance](#performance)
- [Prepare Model Weights](#prepare-model-weights)
- [Download Weights](#download-weights)
- [Merge Files](#merge-files-only-for-hf)
- [MD5 Validation](#md5-validation)
- [Install](#install)
- [Inference](#inference)
- [Text Completion](#text-completion)
- [Evaluation with OpenCompass](#evaluation-with-opencompass)
- [Step-1: Setup OpenCompass](#step-1-setup-opencompass)
- [Step-2: Prepare evaluation config and weights](#step-2-prepare-evaluation-config-and-weights)
- [Step-3: Run evaluation experiments](#step-3-run-evaluation-experiments)
- [Acknowledgement](#acknowledgement)

# Performance

## Comparison with Other Models

- All results below were generated with [OpenCompass](https://github.com/open-compass/opencompass).

> Scores produced by different evaluation toolkits differ because of prompts, settings, and implementation details.

| Datasets         | Mode | Mistral-7B-v0.1 | Mixtral-8x7B | Llama2-70B | DeepSeek-67B-Base | Qwen-72B |
|------------------|------|-----------------|--------------|------------|-------------------|----------|
| MMLU             | PPL  | 64.1            | 71.3         | 69.7       | 71.9              | 77.3     |
| BIG-Bench-Hard   | GEN  | 56.7            | 67.1         | 64.9       | 71.7              | 63.7     |
| GSM-8K           | GEN  | 47.5            | 65.7         | 63.4       | 66.5              | 77.6     |
| MATH             | GEN  | 11.3            | 22.7         | 12.0       | 15.9              | 35.1     |
| HumanEval        | GEN  | 27.4            | 32.3         | 26.2       | 40.9              | 33.5     |
| MBPP             | GEN  | 38.6            | 47.8         | 39.6       | 55.2              | 51.6     |
| ARC-c            | PPL  | 74.2            | 85.1         | 78.3       | 86.8              | 92.2     |
| ARC-e            | PPL  | 83.6            | 91.4         | 85.9       | 93.7              | 96.8     |
| CommonSenseQA    | PPL  | 67.4            | 70.4         | 78.3       | 70.7              | 73.9     |
| NaturalQuestions | GEN  | 24.6            | 29.4         | 34.2       | 29.9              | 27.1     |
| TriviaQA         | GEN  | 56.5            | 66.1         | 70.7       | 67.4              | 60.1     |
| HellaSwag        | PPL  | 78.9            | 82.0         | 82.3       | 82.3              | 85.4     |
| PIQA             | PPL  | 81.6            | 82.9         | 82.5       | 82.6              | 85.2     |
| SIQA             | GEN  | 60.2            | 64.3         | 64.8       | 62.6              | 78.2     |

## Performance of Mixtral-8x7B

```text
dataset           version   metric            mode   mixtral-8x7b-32k
----------------  --------  ----------------  -----  ----------------
mmlu              -         naive_average     ppl    71.34
ARC-c             2ef631    accuracy          ppl    85.08
ARC-e             2ef631    accuracy          ppl    91.36
BoolQ             314797    accuracy          ppl    86.27
commonsense_qa    5545e2    accuracy          ppl    70.43
triviaqa          2121ce    score             gen    66.05
nq                2121ce    score             gen    29.36
openbookqa_fact   6aac9e    accuracy          ppl    85.40
AX_b              6db806    accuracy          ppl    48.28
AX_g              66caf3    accuracy          ppl    48.60
hellaswag         a6e128    accuracy          ppl    82.01
piqa              0cfff2    accuracy          ppl    82.86
siqa              e8d8c5    accuracy          ppl    64.28
math              265cce    accuracy          gen    22.74
gsm8k             1d7fe4    accuracy          gen    65.66
openai_humaneval  a82cae    humaneval_pass@1  gen    32.32
mbpp              1e1056    score             gen    47.80
bbh               -         naive_average     gen    67.14
```

# Prepare Model Weights

## Download Weights

You can download the checkpoints via magnet link or Hugging Face.

### HuggingFace

- [mixtral-8x7b-32kseqlen](https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen)

> If you are unable to access Hugging Face, please try [hf-mirror](https://hf-mirror.com/someone13574/mixtral-8x7b-32kseqlen).

```bash
# Download from Hugging Face
git lfs install
git clone https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen
```
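
Alternatively, a sketch using the `huggingface_hub` CLI should also work, assuming a recent `huggingface_hub` release; the `--local-dir` value is just an example path:

```bash
# Alternative download via the huggingface_hub CLI
pip install -U "huggingface_hub[cli]"
huggingface-cli download someone13574/mixtral-8x7b-32kseqlen --local-dir mixtral-8x7b-32kseqlen
```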

### Magnet Link

Please use this link to download the original files:

```text
magnet:?xt=urn:btih:5546272da9065eddeb6fcd7ffddeef5b75be79a7&dn=mixtral-8x7b-32kseqlen&tr=udp%3A%2F%2Fopentracker.i2p.rocks%3A6969%2Fannounce&tr=http%3A%2F%2Ftracker.openbittorrent.com%3A80%2Fannounce
```
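
A minimal sketch of fetching the magnet link from the command line, assuming the `aria2c` BitTorrent client (any client works):

```bash
# Download via aria2; --seed-time=0 stops seeding once the download finishes,
# and DHT is used to find peers.
aria2c --seed-time=0 "magnet:?xt=urn:btih:5546272da9065eddeb6fcd7ffddeef5b75be79a7&dn=mixtral-8x7b-32kseqlen"
```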

## Merge Files (Only for HF)

The Hugging Face repo splits `consolidated.00.pth` into 11 parts; concatenate them in order to rebuild the checkpoint:

```bash
cd mixtral-8x7b-32kseqlen/

# Merge the checkpoint splits; brace expansion yields split0 .. split10 in order
cat consolidated.00.pth-split{0..10} > consolidated.00.pth
```
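
If the checksum check in the next section fails, first confirm that all 11 split files were fully downloaded; a minimal sketch using only standard shell tools:

```bash
# Expect exactly 11 splits (split0 through split10)
n=$(ls consolidated.00.pth-split* 2>/dev/null | wc -l)
[ "$n" -eq 11 ] || echo "Expected 11 splits, found $n" >&2
```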

## MD5 Validation

Please verify the MD5 checksums to make sure the files are complete.

```bash
md5sum consolidated.00.pth
md5sum tokenizer.model

# Once verified, you can delete the split files.
rm consolidated.00.pth-split*
```

Official MD5 checksums:

```text
╓────────────────────────────────────────────────────────────────────────────╖
║                                                                            ║
║                                ·· md5sum ··                                ║
║                                                                            ║
║        1faa9bc9b20fcfe81fcd4eb7166a79e6  consolidated.00.pth               ║
║        37974873eb68a7ab30c4912fc36264ae  tokenizer.model                   ║
╙────────────────────────────────────────────────────────────────────────────╜
```
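
To compare both files against the official sums non-interactively, `md5sum -c` accepts checksum lines on stdin (the hashes below are copied from the box above):

```bash
# Exits non-zero if either file does not match
md5sum -c - <<'EOF'
1faa9bc9b20fcfe81fcd4eb7166a79e6  consolidated.00.pth
37974873eb68a7ab30c4912fc36264ae  tokenizer.model
EOF
```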

# Install

```bash
conda create --name mixtralkit python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
conda activate mixtralkit

git clone https://github.com/open-compass/MixtralKit
cd MixtralKit/
pip install -r requirements.txt
pip install -e .

# Symlink your downloaded checkpoint folder into the repo
ln -s path/to/checkpoints_folder/ ckpts
```
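
Since the text-completion example below runs on two GPUs, it may help to confirm first that PyTorch can see them:

```bash
# Should print: True 2 (or more GPUs)
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
```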

# Inference

## Text Completion

```bash
python tools/example.py -m ./ckpts -t ckpts/tokenizer.model --num-gpus 2
```

Expected Results:

```text
==============================Example START==============================

[Prompt]:
Who are you?

[Response]:
I am a designer and theorist; a lecturer at the University of Malta and a partner in the firm Barbagallo and Baressi Design, which won the prestigious Compasso d’Oro award in 2004. I was educated in industrial and interior design in the United States

==============================Example END==============================

==============================Example START==============================

[Prompt]:
1 + 1 -> 3
2 + 2 -> 5
3 + 3 -> 7
4 + 4 ->

[Response]:
9
5 + 5 -> 11
6 + 6 -> 13

#include <iostream>

using namespace std;

int addNumbers(int x, int y)
{
    return x + y;
}

int main()
{

==============================Example END==============================
```

# Evaluation with OpenCompass

## Step-1: Setup OpenCompass

- Clone and install OpenCompass:

```bash
# assumes you have already created the conda env named mixtralkit
conda activate mixtralkit

git clone https://github.com/open-compass/opencompass opencompass
cd opencompass

pip install -e .
```

- Prepare the evaluation dataset:

```bash
# Download the dataset to the data/ folder
wget https://github.com/open-compass/opencompass/releases/download/0.1.8.rc1/OpenCompassData-core-20231110.zip
unzip OpenCompassData-core-20231110.zip
```
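
A quick way to confirm the archive unpacked into the `data/` folder that OpenCompass expects:

```bash
# List a few of the unpacked dataset directories
ls data | head
```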

> If you need to evaluate **humaneval**, please see the [Installation Guide](https://opencompass.readthedocs.io/en/latest/get_started/installation.html) for more information.

## Step-2: Prepare evaluation config and weights

```bash
cd opencompass/
# Link the example config into opencompass
ln -s path/to/MixtralKit/playground playground

# Link the model weights into opencompass
mkdir -p ./models/mixtral/
ln -s path/to/checkpoints_folder/ ./models/mixtral/mixtral-8x7b-32kseqlen
```

You should now have a file structure like:

```text
opencompass/
├── configs
│   ├── .....
│   └── .....
├── models
│   └── mixtral
│       └── mixtral-8x7b-32kseqlen
├── data/
├── playground
│   └── eval_mixtral.py
├── ......
```

## Step-3: Run evaluation experiments

```bash
HF_EVALUATE_OFFLINE=1 HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 python run.py playground/eval_mixtral.py
```
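
The three environment variables force the Hugging Face libraries into offline mode, so the run uses only the local datasets and weights. For troubleshooting, recent OpenCompass versions also accept a `--debug` flag that runs tasks in the foreground (an assumption worth checking against your installed version):

```bash
# Run tasks sequentially in the foreground to surface errors directly
HF_EVALUATE_OFFLINE=1 HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python run.py playground/eval_mixtral.py --debug
```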

# Acknowledgement

- [llama-mistral](https://github.com/dzhulgakov/llama-mistral)
- [llama](https://github.com/facebookresearch/llama)