Locutusque committed
Commit 29bb759 · 1 Parent(s): 94a43b2

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ Like version 1, this model will be trained on a single GPU, with hopes of gettin
 
  - Train on 1,000,000 examples of Skylion007/openwebtext at a learning rate of 3e-4 and batch size of 32
  - Once perplexity reaches an average of ~100, a cosine scheduler will be applied, and batch size will be increased to 4096
- - Once the perplexity reaches an average of 50, the model will be trained on graelo/wikipedia and mattymchen/refinedweb-3m, and the batch size will be increased to 393,216.
+ - Once the perplexity reaches an average of 50, the model will be trained on graelo/wikipedia and mattymchen/refinedweb-3m, and the batch size will be increased to 12,288.
 
  - I'm open to any suggestions to modify this roadmap if you feel it isn't sufficient!
  # Disclaimer
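
For readers curious how this perplexity-gated roadmap could be wired up, here is a minimal PyTorch sketch. The dataset names, learning rate, batch sizes, and perplexity thresholds come from the README diff above; everything else (the `run_stage` and `make_loader` helpers, the HF-style model exposing a `.loss`, the cosine horizon, and the moving-average smoothing) is a hypothetical assumption, not code from this repository.

```python
import math
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def run_stage(model, loader, optimizer, scheduler=None, ppl_target=None, device="cuda"):
    """Train until a running-average perplexity drops below ppl_target (or the loader ends)."""
    running = None
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        loss = model(**batch).loss          # causal LM loss (mean cross-entropy), HF-style
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        if scheduler is not None:
            scheduler.step()
        # Exponential moving average of the loss; perplexity = exp(mean loss).
        running = loss.item() if running is None else 0.99 * running + 0.01 * loss.item()
        if ppl_target is not None and math.exp(running) <= ppl_target:
            return True                     # threshold reached, move to the next stage
    return False


def follow_roadmap(model, make_loader):
    """Drive the three roadmap stages. `make_loader(dataset_names, batch_size)` is a
    hypothetical helper returning a DataLoader over the named Hugging Face dataset(s)."""
    optimizer = AdamW(model.parameters(), lr=3e-4)  # roadmap: lr 3e-4

    # Stage 1: 1,000,000 openwebtext examples, batch size 32, constant LR, until ppl ~100.
    run_stage(model, make_loader("Skylion007/openwebtext", 32), optimizer, ppl_target=100)

    # Stage 2: cosine schedule kicks in, batch size 4096, until ppl ~50.
    scheduler = CosineAnnealingLR(optimizer, T_max=50_000)  # horizon is an assumption
    run_stage(model, make_loader("Skylion007/openwebtext", 4096), optimizer, scheduler,
              ppl_target=50)

    # Stage 3: switch to graelo/wikipedia + mattymchen/refinedweb-3m, batch size 12,288.
    run_stage(model, make_loader(["graelo/wikipedia", "mattymchen/refinedweb-3m"], 12_288),
              optimizer, scheduler)
```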