Update README.md
README.md CHANGED
@@ -27,4 +27,14 @@ parameters:
 dtype: float16
 ```
 
-Models
+Models chosen to achieve a mix of performance on reasoning datasets like GSM8K and conversational tasks.
+
+Evaluation results:
+
+| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
+| --- | --- | --- | --- | --- | --- | --- |
+| 73.1 | 69.62 | 87.09 | 64.81 | 62.82 | 81.45 | 72.78 |
+
+The model did achieve an improvement in TruthfulQA over `cookinai/CatMacaroni-Slerp` and in GSM8K over `mncai/mistral-7b-dpo-v5`,
+which was the goal of the merge, leading to an average score better than both. It is unclear why the TruthfulQA metric
+is still meaningfully lower than the base `mncai/mistral-7b-dpo-v5`.
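
The `dtype: float16` line in the diff context above is the tail of the mergekit configuration that produced this merge. For orientation, a minimal SLERP config combining the two models named in the README could look like the sketch below. This is an assumption, not the actual config (which is truncated in this diff): the `layer_range` and `t` values in particular are illustrative placeholders, and only the model names and the choice of `mncai/mistral-7b-dpo-v5` as the base are taken from the README text.

```yaml
# Hypothetical mergekit SLERP config sketch; layer_range and t are
# illustrative assumptions, since the real config is truncated above.
slices:
  - sources:
      - model: mncai/mistral-7b-dpo-v5
        layer_range: [0, 32]
      - model: cookinai/CatMacaroni-Slerp
        layer_range: [0, 32]
merge_method: slerp
base_model: mncai/mistral-7b-dpo-v5
parameters:
  t: 0.5  # assumed equal interpolation between the two parents
dtype: float16
```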