aloobun committed
Commit 2a5343b · verified · 1 Parent(s): 1f88781

Update README.md

Files changed (1):
  1. README.md +18 -15
README.md CHANGED
@@ -13,21 +13,24 @@ This is a distillation experiment with SmolLM2-1.7B as teacher and SmolLM2-360M
 
 **Eval** results using SmolLM evaluation scripts (LightEval):
 
- | Task |Version| Metric |Value | |Stderr|
- |-----------------------|------:|--------|-----:|---|-----:|
- |all | |acc_norm|0.4653|± |0.0115|
- | | |qem |0.0961|± |0.0038|
- |custom:arc:_average:0 | |acc_norm|0.5303|± |0.0119|
- |custom:arc:challenge:0 | 0|acc_norm|0.3771|± |0.0142|
- |custom:arc:easy:0 | 0|acc_norm|0.6835|± |0.0095|
- |custom:commonsense_qa:0| 0|acc_norm|0.3784|± |0.0139|
- |custom:gsm8k:5 | 0|qem |0.0326|± |0.0049|
- |custom:hellaswag:0 | 0|acc_norm|0.5418|± |0.0050|
- |custom:mmlu_pro:0 | 0|acc_norm|0.1127|± |0.0029|
- |custom:openbook_qa:0 | 0|acc_norm|0.3760|± |0.0217|
- |custom:piqa:0 | 0|acc_norm|0.7214|± |0.0105|
- |custom:trivia_qa:0 | 0|qem |0.1596|± |0.0027|
- |custom:winogrande:0 | 0|acc_norm|0.5312|± |0.0140|
+ Eval results using SmolLM evaluation scripts show that the distilled model gains slightly over the base model on a few tasks, by small margins.
+
+ | Task | Version | Metric | **aloobun/d-SmolLM2-360M** Value | **HuggingFaceTB/SmolLM2-360M** Value |
+ |-----------------------|---------|----------|------------|----------|
+ | all | | acc_norm | **0.4653** | **0.4642** |
+ | | | qem | 0.0961 | 0.1004 |
+ | custom:arc:_average:0 | | acc_norm | 0.5303 | 0.5305 |
+ | custom:arc:challenge:0 | 0 | acc_norm | 0.3771 | 0.3797 |
+ | custom:arc:easy:0 | 0 | acc_norm | **0.6835** | 0.6814 |
+ | custom:commonsense_qa:0 | 0 | acc_norm | **0.3784** | 0.3759 |
+ | custom:gsm8k:5 | 0 | qem | 0.0326 | 0.0334 |
+ | custom:hellaswag:0 | 0 | acc_norm | 0.5418 | 0.5456 |
+ | custom:mmlu_pro:0 | 0 | acc_norm | 0.1127 | 0.1130 |
+ | custom:openbook_qa:0 | 0 | acc_norm | **0.3760** | 0.3720 |
+ | custom:piqa:0 | 0 | acc_norm | 0.7214 | 0.7220 |
+ | custom:trivia_qa:0 | 0 | qem | 0.1596 | 0.1675 |
+ | custom:winogrande:0 | 0 | acc_norm | **0.5312** | 0.5241 |
+
 
 
  **Eval** results using lm-eval evaluation scripts:
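
For context, here is a minimal sketch of how one might run a comparable evaluation on the distilled checkpoint with EleutherAI's lm-evaluation-harness. The task list and batch size below are illustrative assumptions, not the exact configuration behind the numbers above, and the stock harness tasks differ from the custom LightEval task definitions.

```python
# Hedged sketch: a comparable lm-eval run (pip install lm-eval).
# Stock harness task names are assumed; they are not the custom
# LightEval task definitions used for the table above.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",
    model_args="pretrained=aloobun/d-SmolLM2-360M",
    tasks=["arc_easy", "arc_challenge", "hellaswag", "piqa", "winogrande"],
    batch_size=8,
)

# Per-task metrics (acc, acc_norm, ...) are keyed by task name.
for task, metrics in results["results"].items():
    print(task, metrics)
```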