cckm committed on
Commit b8ac79e · verified · 1 Parent(s): 562f9ec

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -3,13 +3,15 @@ license: mit
  datasets:
  - tiiuae/falcon-refinedweb
  - HuggingFaceFW/fineweb
+ base_model:
+ - cckm/tinymistral_950m
  language:
  - en
  pipeline_tag: text-generation
  library_name: PyTorch
  ---
 
- ## A deep and narrow, Mistral model (950M params)
+ ## A deep and narrow Mistral model (950M params)
  This checkpoint is for a small (950M params), deep and narrow (40 layers, hidden size=1440) Mistral model, as described in this [[blog post]](https://epsilons.ai/blog.html#post1_3). It is meant for edge applications.
 
  It was trained with ~400B tokens from RefinedWeb, and ~400B tokens from FineWeb (up to epoch 202418). It is a base model, and has not gone through instruct or chat fine-tuning.