Update README.md
README.md
@@ -102,6 +102,12 @@ Compared to Aleph Alpha Luminous Models
 ### BBH:
 
 *performed with newest Language Model Evaluation Harness
+
+### MMLU:
+
+### TruthfulQA:
+
+
 ### MT-Bench (German):
 
 ```
@@ -162,11 +168,6 @@ SauerkrautLM-3b-v1 2.581250
 open_llama_3b_v2 1.456250
 Llama-2-7b 1.181250
 ```
-### MMLU:
-
-### TruthfulQA:
-
-
 ### MT-Bench (English):
 
 ```
@@ -194,6 +195,8 @@ SauerkrautLM-7b-HerO <--- 7.409375
 Mistral-7B-OpenOrca 6.915625
 neural-chat-7b-v3-1 6.812500
 ```
+
+
 ### Additional German Benchmark results:
 
 *performed with newest Language Model Evaluation Harness
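The README notes that these scores were "performed with newest Language Model Evaluation Harness". As a minimal sketch of what that entails, the snippet below uses the Python API of EleutherAI's lm-evaluation-harness; the exact task names, argument names, and the model id shown are assumptions and vary between harness releases, so treat this as illustrative rather than the authors' exact evaluation command.

```python
# Sketch (assumption): scoring a model with the lm-evaluation-harness Python API.
# Task names and arguments differ between harness versions; check the installed
# version's documentation before running.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                                   # Hugging Face backend
    model_args="pretrained=VAGOsolutions/SauerkrautLM-7b-HerO",   # illustrative model id
    tasks=["mmlu", "truthfulqa_mc2"],                             # assumed task names
    batch_size=8,
)

# Per-task metrics are reported under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics)
```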