jerryzh168 committed · Commit 8b3ab58 · verified · 1 Parent(s): bf1e484

Update README.md

Files changed (1)
  1. README.md +22 -9
README.md CHANGED
@@ -119,16 +119,29 @@ lm_eval --model hf --model_args pretrained=pytorch/Phi-4-mini-instruct-float8dq
  `TODO: more complete eval results`


- | Benchmark | | |
- |----------------------------------|-------------|-------------------|
- | | Phi-4 mini-Ins | phi4-mini-float8dq |
- | **Popular aggregated benchmark** | | |
- | **Reasoning** | | |
- | HellaSwag | 54.57 | 54.55 |
- | **Multilingual** | | |
- | **Math** | | |
- | **Overall** | **TODO** | **TODO** |
+ | Benchmark | | |
+ |----------------------------------|----------------|---------------------|
+ | | Phi-4 mini-Ins | phi4-mini-int4wo |
+ | **Popular aggregated benchmark** | | |
+ | mmlu (0-shot) | | x |
+ | mmlu_pro (5-shot) | | x |
+ | **Reasoning** | | |
+ | arc_challenge (0-shot) | | x |
+ | gpqa_main_zeroshot | | x |
+ | HellaSwag | 54.57 | 54.55 |
+ | openbookqa | | x |
+ | piqa (0-shot) | | x |
+ | social_iqa | | x |
+ | truthfulqa_mc2 (0-shot) | | x |
+ | winogrande (0-shot) | | x |
+ | **Multilingual** | | |
+ | mgsm_en_cot_en | | x |
+ | **Math** | | |
+ | gsm8k (5-shot) | | x |
+ | mathqa (0-shot) | | x |
+ | **Overall** | **TODO** | **TODO** |

+
  # Model Performance

  ## Results (H100 machine)
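
The new table rows are still placeholders (`x`), so one way to fill them in might be to generate one `lm_eval` invocation per listed task. The following is a minimal sketch, not the author's script: the `TASKS` list and the `build_commands` helper are hypothetical names, the model ID is taken from the hunk header above, `hellaswag` is assumed to be the harness task name behind the HellaSwag row, and `--num_fewshot` is passed only where the table states a shot count (other tasks keep their harness defaults, which is an assumption).

```python
import shlex

# (task name, few-shot count) pairs copied from the table in this diff.
# None means the table gives no shot count, so the task default is kept
# (an assumption, not something the README states).
TASKS = [
    ("mmlu", 0),
    ("mmlu_pro", 5),
    ("arc_challenge", 0),
    ("gpqa_main_zeroshot", None),
    ("hellaswag", None),
    ("openbookqa", None),
    ("piqa", 0),
    ("social_iqa", None),
    ("truthfulqa_mc2", 0),
    ("winogrande", 0),
    ("mgsm_en_cot_en", None),
    ("gsm8k", 5),
    ("mathqa", 0),
]


def build_commands(model_id, batch_size=8):
    """Return one lm_eval argument list per task (hypothetical helper)."""
    commands = []
    for task, shots in TASKS:
        args = [
            "lm_eval",
            "--model", "hf",
            "--model_args", f"pretrained={model_id}",
            "--tasks", task,
            "--batch_size", str(batch_size),
        ]
        if shots is not None:
            # Only override the shot count where the table specifies one.
            args += ["--num_fewshot", str(shots)]
        commands.append(args)
    return commands


if __name__ == "__main__":
    # Print shell-ready commands; nothing is executed here.
    for cmd in build_commands("pytorch/Phi-4-mini-instruct-float8dq"):
        print(shlex.join(cmd))
```

Running the script only prints the commands, so they can be reviewed (or adjusted for device and batch size) before any evaluation is launched.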