Update README.md
README.md CHANGED
@@ -11,7 +11,7 @@ language:
# Model Card for OLMo 2 32B
We introduce OLMo 2 32B, the latest addition to the family of 7B and 13B models, featuring a 9-point increase in MMLU, among other evaluation improvements, compared to the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model. These gains come from training on the [OLMo-mix-1124](https://huggingface.co/datasets/allenai/olmo-mix-1124) and Dolmino-mix-0325 (releasing soon) datasets and from a staged training approach.

OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
These models are trained on the Dolma dataset. We have released all code, checkpoints, logs, and associated training details on [GitHub](https://github.com/allenai/OLMo-core).
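
The released checkpoints can be loaded with the Hugging Face `transformers` library. Below is a minimal inference sketch; the `allenai/OLMo-2-0325-32B` model ID, dtype, and generation settings are assumptions, so check the model card for the exact identifier and recommended usage.

```python
# Minimal sketch: load OLMo 2 32B with Hugging Face transformers and generate text.
# NOTE: the model ID below is an assumption; verify it against the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-0325-32B"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the 32B weights need roughly 64 GB in bf16
    device_map="auto",           # shard layers across available GPUs
)

inputs = tokenizer("Language modeling is ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
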
@@ -160,7 +160,7 @@ Core model results for OLMo 2 32B are found below.
- 32B Model: ~1 epoch
#### Stage 2: Fine-tuning
- Dataset: Dolmino-Mix-0325 (releasing soon)
- Three training mixes:
  - 100B tokens
  - 100B tokens