Update README.md
README.md
@@ -60,25 +60,22 @@ Here are the evaluation results for DCLM-1B models on various tasks (using [llm-
 | DCLM-1B | 45.2 | 28.1 | 47.5 |
 | DCLM-1B-IT| 47.1 | 33.6 | 51.4 |
 
-
-
-Moreover, we present our evaluation results on Length-Controlled Alpaca-Eval 2.0 to measure our instruction-following capabilities.
+Moreover, we present our evaluation results on Length-Controlled Alpaca-Eval 2.0 to measure our instruction-following capabilities. We report results
+from the leaderboard for non-DCLM models. We compare to state-of-the-art small models, and also include a few larger model sizes for comparison.
 
 | Model | AlpacaEval2.0 LC Win-rate (%) |
 |------------------------------------|------------------------------:|
-
-| DCLM-IT-1B | **8.6** |
-| DCLM-IT-7B | 16.6 |
-| **Reported from the leaderboard** | |
-| Gemma-Instruct-7B | 10.4 |
-| Nous-Hermes-13B | 9.7 |
-| DaVinci001 | 9.0 |
-| LLaMA-2-Chat-13B | 8.4 |
-| Alpaca-7B | 5.9 |
+| Qwen1.5 1.8B Chat | 2.6 |
 | Gemma-Instruct-2B | 5.4 |
 | Phi-2 SFT | 5.9 |
-
-
+| DCLM-IT-1B | **8.6** |
+| **Larger model sizes** | |
+| Alpaca-7B | 5.9 |
+| LLaMA-2-Chat-13B | 8.4 |
+| DaVinci001 | 9.0 |
+| Nous-Hermes-13B | 9.7 |
+| Gemma-Instruct-7B | 10.4 |
+| DCLM-IT-7B | 16.6 |
 
 ## Example Code
 