cckm committed · Commit 562f9ec · verified · 1 Parent(s): 10e0e83

Upload README.md

Files changed (1)
  1. README.md +27 -3
README.md CHANGED
@@ -1,3 +1,27 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - tiiuae/falcon-refinedweb
+ - HuggingFaceFW/fineweb
+ language:
+ - en
+ pipeline_tag: text-generation
+ library_name: PyTorch
+ ---
+
+ ## A deep and narrow Mistral model (950M params)
+ This checkpoint is a small (950M params), deep and narrow (40 layers, hidden size 1440) Mistral model, as described in this [[blog post]](https://epsilons.ai/blog.html#post1_3). It is intended for edge applications.
+
+ It was trained on ~400B tokens from RefinedWeb and ~400B tokens from FineWeb (up to epoch 202418). It is a base model and has not undergone instruct or chat fine-tuning.
+
+ LM Harness numbers:
+ | Benchmark | Result |
+ | ----- | ----- |
+ | arc_c | 0.2884 |
+ | arc_e | 0.5139 |
+ | boolq | 0.6089 |
+ | hellaswag | 0.5888 |
+ | obqa | 0.3280 |
+ | piqa | 0.7388 |
+ | siqa | 0.4038 |
+ | wino | 0.5627 |
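
The README gives only two architecture numbers (40 layers, hidden size 1440) alongside the 950M parameter total. As a rough aid for relating such numbers, here is a minimal sketch of a parameter counter for a Mistral-style decoder (no biases, grouped-query attention, gated MLP, RMSNorm). The function name `mistral_param_count` is hypothetical, and every value in `placeholder_cfg` other than the layer count and hidden size is an assumption, not taken from this checkpoint; the real values would come from the checkpoint's `config.json`.

```python
def mistral_param_count(num_layers, hidden_size, intermediate_size,
                        num_heads, num_kv_heads, vocab_size,
                        head_dim=None, tie_embeddings=False):
    """Count weights of a Mistral-style decoder-only transformer (no biases)."""
    if head_dim is None:
        head_dim = hidden_size // num_heads
    q_dim = num_heads * head_dim        # width of the query / output projections
    kv_dim = num_kv_heads * head_dim    # width of the key / value projections (GQA)

    attn = 2 * hidden_size * q_dim + 2 * hidden_size * kv_dim  # q, o, k, v
    mlp = 3 * hidden_size * intermediate_size                  # gate, up, down
    norms = 2 * hidden_size                                    # two RMSNorms per layer
    per_layer = attn + mlp + norms

    embed = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return num_layers * per_layer + embed + lm_head + final_norm


# Sanity check against the published Mistral-7B-v0.1 configuration.
mistral_7b = mistral_param_count(
    num_layers=32, hidden_size=4096, intermediate_size=14336,
    num_heads=32, num_kv_heads=8, vocab_size=32000,
)
print(f"Mistral-7B-v0.1: {mistral_7b:,} parameters")

# Only num_layers and hidden_size come from the README.
# All other values are HYPOTHETICAL placeholders; replace them with the
# values from the checkpoint's config.json before drawing conclusions.
placeholder_cfg = dict(
    num_layers=40, hidden_size=1440,
    intermediate_size=3840, num_heads=12, num_kv_heads=4, vocab_size=32000,
)
print(f"Placeholder config: {mistral_param_count(**placeholder_cfg):,} parameters")
```

Because the per-layer cost scales with `hidden_size` times the projection widths while the layer count enters only linearly, a narrow hidden size leaves room for many layers within a fixed parameter budget, which is the trade-off the "deep and narrow" description refers to.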