Upload README.md with huggingface_hub
README.md CHANGED

```diff
@@ -58,7 +58,7 @@ logits, emb = model(inputs)
 
 ### Training Data
 
-- **Pretraining corpus:** Our initial model chrono-gpt-v1-19991231 is pretrained on
+- **Pretraining corpus:** Our initial model chrono-gpt-v1-19991231 is pretrained on 21 billion tokens of pre-2000, diverse, high-quality, and open-source text data to ensure no leakage of data afterwards.
 - **Incremental updates:** Yearly updates from 2000 to 2024 with an additional 65 billion tokens of timestamped text.
 
 ### Training Procedure
```
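The card's checkpoint naming follows a vintage-date scheme: the initial model is `chrono-gpt-v1-19991231`, and the yearly updates from 2000 to 2024 imply one checkpoint per December 31 cutoff. A minimal sketch of that mapping, assuming the `chrono-gpt-v1-YYYYMMDD` pattern holds for every yearly update (the helper function below is hypothetical, not part of the release):

```python
# Hypothetical helper: map a year to the checkpoint id implied by the
# chrono-gpt-v1-YYYYMMDD naming scheme described in the model card.
# Assumption: every yearly update uses a December 31 cutoff date.

def checkpoint_for_year(year: int) -> str:
    """Return the checkpoint id whose training cutoff is Dec 31 of `year`."""
    if not 1999 <= year <= 2024:
        raise ValueError("checkpoints cover 1999 (initial) through 2024")
    return f"chrono-gpt-v1-{year}1231"

print(checkpoint_for_year(1999))  # chrono-gpt-v1-19991231
print(checkpoint_for_year(2010))  # chrono-gpt-v1-20101231
```

Selecting the checkpoint this way guarantees that, for a study dated in year `Y`, no training text postdates December 31 of `Y`, which is the leakage guarantee the pretraining-corpus bullet describes.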