ArkaAbacus committed
Commit a3edbde • Parent(s): 0b2503b
Update README.md

README.md CHANGED

@@ -6,7 +6,21 @@ tags:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c14f6b02e1f8f67c73bd05/V6OaYzWhNsFGwrl1M_ZjE.png)

-Slerp Merge of cookinai/CatMacaroni-Slerp and mncai/mistral-7b-dpo-v5
+This model is a [Slerp Merge](https://github.com/cg123/mergekit/blob/main/mergekit/merge_methods/slerp.py) of [cookinai/CatMacaroni-Slerp](https://huggingface.co/cookinai/CatMacaroni-Slerp) and [mncai/mistral-7b-dpo-v5](https://huggingface.co/mncai/mistral-7b-dpo-v5).
+
+# Evaluation Results
+
+### HuggingFace Leaderboard
+
+| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
+| --- | --- | --- | --- | --- | --- | --- |
+| 73.1 | 69.62 | 87.09 | 64.81 | 62.82 | 81.45 | 72.78 |
+
+The model did achieve an improvement in TruthfulQA over `cookinai/CatMacaroni-Slerp` and in GSM8K over `mncai/mistral-7b-dpo-v5`,
+which was the goal of the merge, leading to an average score better than both. It is unclear why the TruthfulQA metric
+is still meaningfully lower than that of the base `mncai/mistral-7b-dpo-v5`.
+
+# Training Details

.yaml file for mergekit

@@ -28,15 +42,3 @@ parameters:
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
-
-Models chosen to achieve a mix of performance on reasoning datasets like GSM8k and conversational tasks.
-
-Evaluation results:
-
-| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
-| --- | --- | --- | --- | --- | --- | --- |
-| 73.1 | 69.62 | 87.09 | 64.81 | 62.82 | 81.45 | 72.78 |
-
-The model did achieve an improvement in TruthfulQA over `cookinai/CatMacaroni-Slerp` and GSM8K over `mncai/mistral-7b-dpo-v5`
-which was the goal of the merge leading to an average score that was a better than both. It is unclear why the TruthfulQA metric
-is still meaningfully lower than the base `mncai/mistral-7b-dpo-v5`.
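For readers unfamiliar with the merge method named above: SLERP (spherical linear interpolation) blends each pair of corresponding weight tensors along the arc between them rather than along a straight line. A standard textbook formulation (not taken from this repository), with p and q the two flattened weight vectors, θ the angle between them, and t ∈ [0, 1] the interpolation factor, is:

$$
\operatorname{slerp}(p, q; t) \;=\; \frac{\sin\!\big((1-t)\theta\big)}{\sin\theta}\,p \;+\; \frac{\sin(t\theta)}{\sin\theta}\,q,
\qquad
\cos\theta \;=\; \frac{\langle p, q\rangle}{\lVert p\rVert\,\lVert q\rVert}.
$$

The `t` values in the mergekit configuration control this interpolation factor per group of tensors.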
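Only the tail of the mergekit configuration (the `- value: 0.5` fallback and `dtype: float16`) is visible in the diff above, so the full file cannot be recovered from this page. As a rough sketch only, a SLERP merge of these two models in mergekit's YAML format typically looks like the following; the `base_model`, `layer_range`, and per-filter `t` schedules shown here are illustrative assumptions, not the values actually used for this merge:

```yaml
# Hypothetical sketch of a mergekit SLERP config for this merge.
# Only "- value: 0.5 # fallback for rest of tensors" and "dtype: float16"
# are visible in the diff above; everything else is an assumed placeholder.
slices:
  - sources:
      - model: cookinai/CatMacaroni-Slerp
        layer_range: [0, 32]          # assumed: full 32-layer Mistral-7B stack
      - model: mncai/mistral-7b-dpo-v5
        layer_range: [0, 32]
merge_method: slerp
base_model: mncai/mistral-7b-dpo-v5   # assumed choice of base model
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]    # illustrative per-layer schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]    # illustrative per-layer schedule
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```

A config like this is normally applied with mergekit's `mergekit-yaml` entry point, e.g. `mergekit-yaml config.yml ./merged-model`.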