Update README.md
README.md
CHANGED
@@ -75,44 +75,38 @@ Please tell me about how merged models can benefit from existent top-models.<|im
 
 **Language Model evaluation Harness**
 ```
-
-
-
-| | |acc_norm| 0.
-
-
-
-
-
-| | |
-
-| | |acc | 0.
-
-
-
-
-| | |acc_norm| 0.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-| | |ter | 0.4709|± |0.0037|
-|wmt16-en-de | 0|bleu |24.9193|± |0.3536|
-| | |chrf | 0.5267|± |0.0033|
-| | |ter | 0.6506|± |0.0041|
-|xnli_de | 0|acc | 0.4451|± |0.0070|
-|xnli_en | 0|acc | 0.5581|± |0.0070|
+|arc_challenge | 0|acc | 0.5555|± |0.0145|
+| | |acc_norm| 0.5956|± |0.0143|
+|arc_easy | 0|acc | 0.8388|± |0.0075|
+| | |acc_norm| 0.8262|± |0.0078|
+|boolq | 1|acc | 0.8725|± |0.0058|
+|copa | 0|acc | 0.9100|± |0.0288|
+|hellaswag | 0|acc | 0.6285|± |0.0048|
+| | |acc_norm| 0.8125|± |0.0039|
+|lambada_openai_mt_de| 0|ppl |45.7314|± |2.8280|
+| | |acc | 0.4141|± |0.0069|
+|lambada_standard | 0|ppl | 3.5467|± |0.0779|
+| | |acc | 0.6922|± |0.0064|
+|multirc | 1|acc | 0.1459|± |0.0114|
+|openbookqa | 0|acc | 0.3640|± |0.0215|
+| | |acc_norm| 0.4600|± |0.0223|
+|piqa | 0|acc | 0.8123|± |0.0091|
+| | |acc_norm| 0.8281|± |0.0088|
+|race | 1|acc | 0.4507|± |0.0154|
+|rte | 0|acc | 0.7040|± |0.0275|
+|truthfulqa_mc | 1|mc1 | 0.3329|± |0.0165|
+| | |mc2 | 0.4915|± |0.0150|
+|webqs | 0|acc | 0.1924|± |0.0087|
+|wic | 0|acc | 0.5752|± |0.0196|
+|winogrande | 0|acc | 0.7301|± |0.0125|
+|wsc | 0|acc | 0.6154|± |0.0479|
+|drop | 1|em | 0.2140|± |0.0042|
+| | |f1 | 0.4011|± |0.0041|
+|triviaqa | 3|em | 0.6259|± |0.0036|
+|wmt16-de-en | 0|bleu |39.2043|± |0.3982|
+|wmt16-en-de | 0|bleu |25.5745|± |0.3492|
+|xnli_de | 0|acc | 0.4547|± |0.0070|
+|xnli_en | 0|acc | 0.5595|± |0.0070|
 ```
 
 ## Disclaimer
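
The result rows in this diff are pipe-delimited, and rows with an empty first cell continue the task named on the row above. To compare the old and new numbers programmatically, the rows could be parsed along the following lines. This is a minimal sketch, not part of the commit. The function name `parse_harness_table` is hypothetical, and the column meanings (task, version, metric, value, a `±` separator, standard error) are an assumption, since the table header does not appear in this hunk. Truncated rows are skipped rather than guessed at.

```python
from typing import Dict, List, Optional


def parse_harness_table(text: str) -> List[Dict[str, object]]:
    """Parse pipe-delimited result rows into a list of dicts.

    Assumed column order (the header is not shown in the diff):
    task | version | metric | value | ± | stderr
    """
    rows: List[Dict[str, object]] = []
    task: Optional[str] = None
    version: Optional[int] = None
    for raw in text.splitlines():
        line = raw.strip()
        # Drop a leading diff marker so added/removed lines parse alike.
        if line[:1] in ("+", "-"):
            line = line[1:].strip()
        # Only complete table rows start and end with a pipe.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [c.strip() for c in line.split("|")[1:-1]]
        if len(cells) != 6:
            continue  # truncated or malformed row
        name, ver, metric, value, _pm, stderr = cells
        try:
            if name:
                # A non-empty first cell starts a new task.
                task, version = name, int(ver)
            if task is None:
                continue  # continuation row with no task above it
            rows.append({
                "task": task,
                "version": version,
                "metric": metric,
                "value": float(value),
                "stderr": float(stderr),
            })
        except ValueError:
            continue  # header, separator, or non-numeric row
    return rows


if __name__ == "__main__":
    sample = (
        "+|arc_challenge | 0|acc | 0.5555|± |0.0145|\n"
        "+| | |acc_norm| 0.5956|± |0.0143|\n"
        "+|boolq | 1|acc | 0.8725|± |0.0058|\n"
    )
    for row in parse_harness_table(sample):
        print(row["task"], row["metric"], row["value"], row["stderr"])
```

Carrying the task name forward keeps each metric attached to its task, so rows such as `acc_norm` can be looked up by `(task, metric)` when computing differences between two versions of the table.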