aloobun committed
Commit 2a5343b · verified · 1 Parent(s): 1f88781

Update README.md

Files changed (1):
  1. README.md +18 -15
README.md CHANGED
@@ -13,21 +13,24 @@ This is a distillation experiment with SmolLM2-1.7B as teacher and SmolLM2-360M
 
 **Eval** results using SmolLM evaluation scripts (LightEval):
 
- | Task |Version| Metric |Value | |Stderr|
- |-----------------------|------:|--------|-----:|---|-----:|
- |all | |acc_norm|0.4653|± |0.0115|
- | | |qem |0.0961|± |0.0038|
- |custom:arc:_average:0 | |acc_norm|0.5303|± |0.0119|
- |custom:arc:challenge:0 | 0|acc_norm|0.3771|± |0.0142|
- |custom:arc:easy:0 | 0|acc_norm|0.6835|± |0.0095|
- |custom:commonsense_qa:0| 0|acc_norm|0.3784|± |0.0139|
- |custom:gsm8k:5 | 0|qem |0.0326|± |0.0049|
- |custom:hellaswag:0 | 0|acc_norm|0.5418|± |0.0050|
- |custom:mmlu_pro:0 | 0|acc_norm|0.1127|± |0.0029|
- |custom:openbook_qa:0 | 0|acc_norm|0.3760|± |0.0217|
- |custom:piqa:0 | 0|acc_norm|0.7214|± |0.0105|
- |custom:trivia_qa:0 | 0|qem |0.1596|± |0.0027|
- |custom:winogrande:0 | 0|acc_norm|0.5312|± |0.0140|
+ Eval results using SmolLM evaluation scripts show that the distilled model gains slightly over the base model on a few tasks, by small margins.
+
+ | Task | Version | Metric | **aloobun/d-SmolLM2-360M** Value | **HuggingFaceTB/SmolLM2-360M** Value |
+ |-----------------------|---------|----------|------------|----------|
+ | all | | acc_norm | **0.4653** | **0.4642** |
+ | | | qem | 0.0961 | 0.1004 |
+ | custom:arc:_average:0 | | acc_norm | 0.5303 | 0.5305 |
+ | custom:arc:challenge:0 | 0 | acc_norm | 0.3771 | 0.3797 |
+ | custom:arc:easy:0 | 0 | acc_norm | **0.6835** | 0.6814 |
+ | custom:commonsense_qa:0 | 0 | acc_norm | **0.3784** | 0.3759 |
+ | custom:gsm8k:5 | 0 | qem | 0.0326 | 0.0334 |
+ | custom:hellaswag:0 | 0 | acc_norm | 0.5418 | 0.5456 |
+ | custom:mmlu_pro:0 | 0 | acc_norm | 0.1127 | 0.1130 |
+ | custom:openbook_qa:0 | 0 | acc_norm | **0.3760** | 0.3720 |
+ | custom:piqa:0 | 0 | acc_norm | 0.7214 | 0.7220 |
+ | custom:trivia_qa:0 | 0 | qem | 0.1596 | 0.1675 |
+ | custom:winogrande:0 | 0 | acc_norm | **0.5312** | 0.5241 |
+
 
 
  **Eval** results using lm-eval evaluation scripts:
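
For context, here is a minimal sketch of how one might run a comparable evaluation on the distilled checkpoint with EleutherAI's lm-evaluation-harness. The task list and batch size below are illustrative assumptions, not the exact configuration behind the numbers above, and the stock harness tasks differ from the custom LightEval task definitions.

```python
# Hedged sketch: a comparable lm-eval run (pip install lm-eval).
# Stock harness task names are assumed; they are not the custom
# LightEval task definitions used for the table above.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",
    model_args="pretrained=aloobun/d-SmolLM2-360M",
    tasks=["arc_easy", "arc_challenge", "hellaswag", "piqa", "winogrande"],
    batch_size=8,
)

# Per-task metrics (acc, acc_norm, ...) are keyed by task name.
for task, metrics in results["results"].items():
    print(task, metrics)
```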