leaderboard-pr-bot committed
Commit a6ff2f0 (1 parent: 16d36cd)
Adding Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr
The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
README.md
CHANGED
@@ -90,4 +90,17 @@ ASSISTANT: To help your vehicle start, I will guide you through a step-by-step p
 By following these steps, you should be able to diagnose and potentially fix the issue causing your car to not start. However, if after going through these checks and still having trouble, it is recommended to seek assistance from a qualified mechanic.
 ```
 
-[Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+[Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-llama2-7b)
+
+| Metric | Value |
+|-----------------------|---------------------------|
+| Avg. | 41.88 |
+| ARC (25-shot) | 46.59 |
+| HellaSwag (10-shot) | 67.52 |
+| MMLU (5-shot) | 48.37 |
+| TruthfulQA (0-shot) | 49.72 |
+| Winogrande (5-shot) | 63.77 |
+| GSM8K (5-shot) | 5.69 |
+| DROP (3-shot) | 11.53 |
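
A quick sanity check on the table this PR adds: the Avg. row looks like the unweighted mean of the seven benchmark scores. The sketch below assumes that is how the value was computed (the PR text does not say so), and the names `scores` and `mean_score` are hypothetical, introduced only for this illustration.

```python
# Benchmark scores copied from the table added by this PR.
scores = {
    "ARC (25-shot)": 46.59,
    "HellaSwag (10-shot)": 67.52,
    "MMLU (5-shot)": 48.37,
    "TruthfulQA (0-shot)": 49.72,
    "Winogrande (5-shot)": 63.77,
    "GSM8K (5-shot)": 5.69,
    "DROP (3-shot)": 11.53,
}


def mean_score(values):
    """Return the unweighted mean of the values, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


# Assumption: Avg. is a plain mean over all seven benchmarks.
avg = mean_score(scores.values())
print(avg)  # prints 41.88, matching the Avg. row
```

Under that assumption the reported 41.88 is reproduced, so the table is at least internally consistent.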