Commit b9abd8d by metascroy (verified) · 1 parent: fb71c2a

Update README.md

Files changed (1): README.md (+16 -16)
@@ -157,35 +157,35 @@ Need to install lm-eval from source: https://github.com/EleutherAI/lm-evaluation
 
  ## baseline
  ```Shell
- lm_eval --model hf --model_args pretrained=microsoft/Phi-4-mini-instruct --tasks hellaswag --device cuda:0 --batch_size 64
+ lm_eval --model hf --model_args pretrained=microsoft/Phi-4-mini-instruct --tasks hellaswag --device cuda:0 --batch_size 8
  ```
 
  ## int8 dynamic activation and int4 weight quantization (8da4w)
  ```Shell
- lm_eval --model hf --model_args pretrained=pytorch/Phi-4-mini-instruct-8da4w --tasks hellaswag --device cuda:0 --batch_size 64
+ lm_eval --model hf --model_args pretrained=pytorch/Phi-4-mini-instruct-8da4w --tasks hellaswag --device cuda:0 --batch_size 8
  ```
 
  | Benchmark | | |
  |----------------------------------|-------------|-------------------|
  | | Phi-4 mini-Ins | phi4-mini-8da4w|
  | **Popular aggregated benchmark** | | |
- | mmlu (0 shot) | 66.73 | 63.11 |
- | mmlu_pro (5-shot) | 46.43 | 35.31 |
+ | mmlu (0 shot) | 66.73 | 60.75 |
+ | mmlu_pro (5-shot) | 46.43 | 11.75 |
  | **Reasoning** | | |
- | arc_challenge | 56.91 | 55.12 |
- | gpqa_main_zeroshot | 30.13 | 29.02 |
- | hellaswag | 54.57 | 53.23 |
- | openbookqa | 33.00 | 32.40 |
- | piqa (0-shot) | 77.64 | 76.66 |
- | siqa | 49.59 | 47.08 |
- | truthfulqa_mc2 (0-shot) | 48.39 | 47.99 |
- | winogrande (0-shot) | 71.11 | 70.17 |
+ | arc_challenge | 56.91 | 48.46 |
+ | gpqa_main_zeroshot | 30.13 | 30.80 |
+ | hellaswag | 54.57 | 50.35 |
+ | openbookqa | 33.00 | 30.40 |
+ | piqa (0-shot) | 77.64 | 74.43 |
+ | siqa | 49.59 | 44.98 |
+ | truthfulqa_mc2 (0-shot) | 48.39 | 51.35 |
+ | winogrande (0-shot) | 71.11 | 70.32 |
  | **Multilingual** | | |
- | mgsm_en_cot_en | 60.80 | 58.8 |
+ | mgsm_en_cot_en | 60.80 | 57.60 |
  | **Math** | | |
- | gsm8k (5-shot) | 81.88 | 70.43 |
- | Mathqa (0-shot) | 42.31 | 41.57 |
- | **Overall** | 55.35 | 52.38 |
+ | gsm8k (5-shot) | 81.88 | 61.71 |
+ | Mathqa (0-shot) | 42.31 | 36.95 |
+ | **Overall** | 55.35 | 48.45 |
 
 
  # Exporting to ExecuTorch
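
A reviewer who wants to sanity-check the updated table can recompute the **Overall** row. The sketch below assumes that row is the unweighted mean of the 13 task scores, which is not stated in the README but is consistent with the numbers in this commit. The variable names are illustrative only, and the scores are copied from the new side of the diff.

```python
# (baseline, 8da4w) scores copied from the updated table in this commit.
scores = {
    "mmlu": (66.73, 60.75),
    "mmlu_pro": (46.43, 11.75),
    "arc_challenge": (56.91, 48.46),
    "gpqa_main_zeroshot": (30.13, 30.80),
    "hellaswag": (54.57, 50.35),
    "openbookqa": (33.00, 30.40),
    "piqa": (77.64, 74.43),
    "siqa": (49.59, 44.98),
    "truthfulqa_mc2": (48.39, 51.35),
    "winogrande": (71.11, 70.32),
    "mgsm_en_cot_en": (60.80, 57.60),
    "gsm8k": (81.88, 61.71),
    "mathqa": (42.31, 36.95),
}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumption: "Overall" is the plain average over all task rows.
baseline_overall = round(mean(b for b, _ in scores.values()), 2)
quant_overall = round(mean(q for _, q in scores.values()), 2)

# Per-task change (quantized minus baseline); negative means a regression.
deltas = {task: round(q - b, 2) for task, (b, q) in scores.items()}
largest_drop = min(deltas, key=deltas.get)

print(f"baseline overall: {baseline_overall}")
print(f"8da4w overall:    {quant_overall}")
print(f"largest drop:     {largest_drop} ({deltas[largest_drop]})")
```

Under that assumption the recomputed means come out to 55.35 and 48.45, matching the table, and the largest regression is on mmlu_pro.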