ArkaAbacus committed
Commit a3edbde • Parent(s): 0b2503b
Update README.md

README.md CHANGED

@@ -6,7 +6,21 @@ tags:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c14f6b02e1f8f67c73bd05/V6OaYzWhNsFGwrl1M_ZjE.png)

-Slerp Merge of cookinai/CatMacaroni-Slerp and mncai/mistral-7b-dpo-v5
+This model is a [Slerp Merge](https://github.com/cg123/mergekit/blob/main/mergekit/merge_methods/slerp.py) of [cookinai/CatMacaroni-Slerp](https://huggingface.co/cookinai/CatMacaroni-Slerp) and [mncai/mistral-7b-dpo-v5](https://huggingface.co/mncai/mistral-7b-dpo-v5).
+
+# Evaluation Results
+
+### HuggingFace Leaderboard
+
+| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
+| --- | --- | --- | --- | --- | --- | --- |
+| 73.1 | 69.62 | 87.09 | 64.81 | 62.82 | 81.45 | 72.78 |
+
+The model did achieve an improvement in TruthfulQA over `cookinai/CatMacaroni-Slerp` and in GSM8K over `mncai/mistral-7b-dpo-v5`,
+which was the goal of the merge, leading to an average score better than both. It is unclear why the TruthfulQA metric
+is still meaningfully lower than that of the base `mncai/mistral-7b-dpo-v5`.
+
+# Training Details

.yaml file for mergekit

@@ -28,15 +42,3 @@ parameters:
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
-
-Models chosen to achieve a mix of performance on reasoning datasets like GSM8k and conversational tasks.
-
-Evaluation results:
-
-| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
-| --- | --- | --- | --- | --- | --- | --- |
-| 73.1 | 69.62 | 87.09 | 64.81 | 62.82 | 81.45 | 72.78 |
-
-The model did achieve an improvement in TruthfulQA over `cookinai/CatMacaroni-Slerp` and GSM8K over `mncai/mistral-7b-dpo-v5`
-which was the goal of the merge leading to an average score that was a better than both. It is unclear why the TruthfulQA metric
-is still meaningfully lower than the base `mncai/mistral-7b-dpo-v5`.
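For readers unfamiliar with the merge method named above: SLERP (spherical linear interpolation) blends each pair of corresponding weight tensors along the arc between them rather than along a straight line. A standard textbook formulation (not taken from this repository), with p and q the two flattened weight vectors, θ the angle between them, and t ∈ [0, 1] the interpolation factor, is:

$$
\operatorname{slerp}(p, q; t) \;=\; \frac{\sin\!\big((1-t)\theta\big)}{\sin\theta}\,p \;+\; \frac{\sin(t\theta)}{\sin\theta}\,q,
\qquad
\cos\theta \;=\; \frac{\langle p, q\rangle}{\lVert p\rVert\,\lVert q\rVert}.
$$

The `t` values in the mergekit configuration control this interpolation factor per group of tensors.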
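Only the tail of the mergekit configuration (the `- value: 0.5` fallback and `dtype: float16`) is visible in the diff above, so the full file cannot be recovered from this page. As a rough sketch only, a SLERP merge of these two models in mergekit's YAML format typically looks like the following; the `base_model`, `layer_range`, and per-filter `t` schedules shown here are illustrative assumptions, not the values actually used for this merge:

```yaml
# Hypothetical sketch of a mergekit SLERP config for this merge.
# Only "- value: 0.5 # fallback for rest of tensors" and "dtype: float16"
# are visible in the diff above; everything else is an assumed placeholder.
slices:
  - sources:
      - model: cookinai/CatMacaroni-Slerp
        layer_range: [0, 32]          # assumed: full 32-layer Mistral-7B stack
      - model: mncai/mistral-7b-dpo-v5
        layer_range: [0, 32]
merge_method: slerp
base_model: mncai/mistral-7b-dpo-v5   # assumed choice of base model
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]    # illustrative per-layer schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]    # illustrative per-layer schedule
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```

A config like this is normally applied with mergekit's `mergekit-yaml` entry point, e.g. `mergekit-yaml config.yml ./merged-model`.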