Update README.md
README.md CHANGED
@@ -78,6 +78,15 @@ You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute
```
The model has not been trained with a specific system prompt in mind.

### Intermediate Checkpoints

To facilitate research on RL finetuning, we have released the intermediate checkpoints saved during the model's RLVR training.
The model weights are saved every 20 training steps and are accessible through the revisions of the HuggingFace repository.
For example, you can load a checkpoint with:
```
from transformers import AutoModelForCausalLM

olmo_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0325-32B-Instruct", revision="step_200")
```
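Since checkpoints are saved every 20 steps, the valid revision names follow a fixed pattern (`step_20`, `step_40`, …). As a small sketch (the `latest_saved_revision` helper is hypothetical, not part of the repository), you could map an arbitrary training step to the most recent saved checkpoint's revision name:

```python
def latest_saved_revision(step: int, interval: int = 20) -> str:
    # Weights are saved every `interval` training steps; round down to
    # the most recent saved step and format it as a revision name.
    if step < interval:
        raise ValueError(f"no checkpoint saved before step {interval}")
    return f"step_{(step // interval) * interval}"

print(latest_saved_revision(200))  # step_200
print(latest_saved_revision(250))  # step_240
```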

### Bias, Risks, and Limitations

The OLMo-2 models have limited safety training and, unlike ChatGPT, are not deployed with automatic in-the-loop filtering of responses, so the models can produce problematic outputs (especially when prompted to do so).