Update README.md
README.md
@@ -78,6 +78,15 @@ You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute
 ```
 The model has not been trained with a specific system prompt in mind.
 
+### Intermediate Checkpoints
+
+To facilitate research on RL finetuning, we have released our intermediate checkpoints during the model's RLVR training.
+The model weights are saved every 20 training steps and can be accessed via the revisions of the HuggingFace repository.
+For example, you can load with:
+```
+olmo_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0325-32B-Instruct", revision="step_200")
+```
+
 ### Bias, Risks, and Limitations
 
 The OLMo-2 models have limited safety training, but are not deployed automatically with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so).
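The snippet above assumes `AutoModelForCausalLM` is already imported. A self-contained sketch of loading a checkpoint by revision, assuming the `transformers` library and the `step_N` revision naming shown above; the `revision_name` and `load_checkpoint` helpers are hypothetical, for illustration only:

```python
# Sketch: loading an intermediate RLVR checkpoint by revision tag.
# Assumes `pip install transformers`; the helpers below are
# illustrative, not part of the OLMo repository.

REPO = "allenai/OLMo-2-0325-32B-Instruct"
STEP_INTERVAL = 20  # weights were saved every 20 RLVR training steps


def revision_name(step: int) -> str:
    """Map a training step to its revision tag, e.g. 200 -> "step_200"."""
    if step <= 0 or step % STEP_INTERVAL != 0:
        raise ValueError(f"checkpoints exist only at multiples of {STEP_INTERVAL}")
    return f"step_{step}"


def load_checkpoint(step: int):
    """Download and return the model weights at the given training step.

    Not called here: the 32B weights require substantial disk and memory.
    """
    from transformers import AutoModelForCausalLM  # deferred heavy import

    return AutoModelForCausalLM.from_pretrained(REPO, revision=revision_name(step))


print(revision_name(200))  # -> step_200
```

The available `step_N` tags can also be browsed under the "Files and versions" revision dropdown of the HuggingFace repository page.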