Adding benchmark comparison table
README.md
CHANGED
@@ -16,6 +16,15 @@ license: apache-2.0
 
 GGUFs can be found [here](https://huggingface.co/InferenceIllusionist/Excalibur-7b-GGUF)
 
+### Performance Comparison
+| Name | Avg. | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
+| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
+| <b>Excalibur-7b</b> | <u><b>73.6</b></u> | <u><b>69.71</b></u> | <u><b>87.56</b></u> | <u><b>65.66</b></u> | <u><b>67.24</b></u> | <u><b>82.79</b></u> | <u><b>68.61</b></u> |
+| Magic-Dolphin-7b | 67.48 | 65.78 | 85.61 | 64.64 | 58.01 | 79.64 | 51.18 |
+| merlinite-7b | 64 | 63.65 | 84.52 | 64.91 | 50.15 | 79.72 | 41.09 |
+
+
+### Methodology
 [Magic-Dolphin-7b](https://huggingface.co/InferenceIllusionist/Magic-Dolphin-7b) was an unexpected surprise. Profoundly satisfied with it as a first attempt. For this follow-up I wanted to target the MMLU benchmark specifically.
 The challenge this time was placing more weight on Merlinite-7b as an unknown quantity that hasn't been in the spotlight despite its novel LAB tuning method.
 
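
As a rough sanity check on the table added in this commit, the sketch below recomputes each model's Avg. as the unweighted mean of the six benchmark columns. That averaging rule is an assumption (the README does not state how Avg. was derived), and the `scores` dictionary and `recomputed_average` helper are hypothetical names used only for illustration.

```python
from statistics import mean

# Scores copied from the Performance Comparison table in the diff.
# Column order: ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K.
scores = {
    "Excalibur-7b": {
        "reported_avg": 73.6,
        "benchmarks": [69.71, 87.56, 65.66, 67.24, 82.79, 68.61],
    },
    "Magic-Dolphin-7b": {
        "reported_avg": 67.48,
        "benchmarks": [65.78, 85.61, 64.64, 58.01, 79.64, 51.18],
    },
    "merlinite-7b": {
        "reported_avg": 64,
        "benchmarks": [63.65, 84.52, 64.91, 50.15, 79.72, 41.09],
    },
}


def recomputed_average(benchmarks):
    """Unweighted mean of the benchmark scores (assumed definition of Avg.)."""
    return mean(benchmarks)


for name, row in scores.items():
    avg = recomputed_average(row["benchmarks"])
    delta = avg - row["reported_avg"]
    print(f"{name}: reported {row['reported_avg']}, recomputed {avg:.3f}, delta {delta:+.3f}")
```

Under that assumption, the recomputed means agree with the reported Avg. values to within rounding.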