Update README.md
README.md CHANGED
@@ -78,6 +78,15 @@ You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute
```
The model has not been trained with a specific system prompt in mind.

### Intermediate Checkpoints

To facilitate research on RL finetuning, we have released the intermediate checkpoints saved during the model's RLVR training.
The model weights are saved every 20 training steps and are accessible through the revisions of the HuggingFace repository.
For example, you can load a checkpoint with:
```
from transformers import AutoModelForCausalLM

olmo_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0325-32B-Instruct", revision="step_200")
```
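Since checkpoints are saved every 20 steps, the valid revision names follow a fixed pattern (`step_20`, `step_40`, …). As a small sketch (the `latest_saved_revision` helper is hypothetical, not part of the repository), you could map an arbitrary training step to the most recent saved checkpoint's revision name:

```python
def latest_saved_revision(step: int, interval: int = 20) -> str:
    # Weights are saved every `interval` training steps; round down to
    # the most recent saved step and format it as a revision name.
    if step < interval:
        raise ValueError(f"no checkpoint saved before step {interval}")
    return f"step_{(step // interval) * interval}"

print(latest_saved_revision(200))  # step_200
print(latest_saved_revision(250))  # step_240
```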

### Bias, Risks, and Limitations

The OLMo-2 models have limited safety training and, unlike ChatGPT, are not deployed with automatic in-the-loop filtering of responses, so the models can produce problematic outputs (especially when prompted to do so).