Update README.md

README.md

@@ -101,19 +101,19 @@ lm_eval --model hf --model_args pretrained=pytorch/Phi-4-mini-instruct-float8dq
 | Benchmark                        |                |                     |
 |----------------------------------|----------------|---------------------|
-|                                  | Phi-4 mini-Ins | phi4-mini-int4wo    |
 | **Popular aggregated benchmark** |                |                     |
-| mmlu (0-shot)                    |                |  x              |
-| mmlu_pro (5-shot)                |                |  x              |
 | **Reasoning**                    |                |                     |
 | arc_challenge (0-shot)           | 56.91          |  56.66              |
 | gpqa_main_zeroshot               | 30.13          |  x              |
 | HellaSwag                        | 54.57          |  54.55              |
-| openbookqa                       | 33.00          |  x              |
-| piqa (0-shot)	                   | 77.64          |  x              |
-| social_iqa                       | 49.59          |  x              |
-| truthfulqa_mc2 (0-shot)          | 48.39          |  x              |
-| winogrande  (0-shot)             | 71.11          |  x              |
 | **Multilingual**                 |                |                     |
 | mgsm_en_cot_en                   | 60.8           |  60.0               |
 | **Math**                         |                |                     |

 | Benchmark                        |                |                     |
 |----------------------------------|----------------|---------------------|
+|                                  | Phi-4 mini-Ins | phi4-mini-float8dq  |
 | **Popular aggregated benchmark** |                |                     |
+| mmlu (0-shot)                    | 66.73          |  x              |
+| mmlu_pro (5-shot)                | 46.43          |  x              |
 | **Reasoning**                    |                |                     |
 | arc_challenge (0-shot)           | 56.91          |  56.66              |
 | gpqa_main_zeroshot               | 30.13          |  x              |
 | HellaSwag                        | 54.57          |  54.55              |
+| openbookqa                       | 33.00          |  33.60              |
+| piqa (0-shot)	                   | 77.64          |  77.48              |
+| social_iqa                       | 49.59          |  49.28              |
+| truthfulqa_mc2 (0-shot)          | 48.39          |  48.09              |
+| winogrande  (0-shot)             | 71.11          |  72.77              |
 | **Multilingual**                 |                |                     |
 | mgsm_en_cot_en                   | 60.8           |  60.0               |
 | **Math**                         |                |                     |