natolambert committed on
Commit e97307e · verified · 1 Parent(s): 5942a2f

Update README.md

  1. README.md +9 -0
README.md CHANGED
@@ -78,6 +78,15 @@ You are OLMo 2, a helpful and harmless AI Assistant built by the Allen Institute
  ```
  The model has not been trained with a specific system prompt in mind.
 
+ ### Intermediate Checkpoints
+
+ To facilitate research on RL finetuning, we have released intermediate checkpoints from the model's RLVR training.
+ The model weights are saved every 20 training steps and can be accessed as revisions of the Hugging Face repository.
+ For example, you can load one with:
+ ```
+ olmo_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-0325-32B-Instruct", revision="step_200")
+ ```
+
  ### Bias, Risks, and Limitations
 
  The OLMo-2 models have limited safety training, but are not deployed automatically with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so).
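
The checkpoint loading described in the added "Intermediate Checkpoints" section could be wrapped roughly as follows. This is a minimal sketch under stated assumptions: the `step_<N>` revision naming is extrapolated from the README's single `step_200` example, the positive-multiple-of-20 check is inferred from the "every 20 training steps" statement, and the helper names (`checkpoint_revision`, `available_revisions`, `load_checkpoint`) are hypothetical. Checking `available_revisions()` first is advisable, since the README does not list which steps were actually published.

```python
REPO_ID = "allenai/OLMo-2-0325-32B-Instruct"
SAVE_INTERVAL = 20  # the README states weights are saved every 20 training steps


def checkpoint_revision(step: int) -> str:
    """Build a revision name for an RLVR training step.

    Assumption: revisions follow the `step_<N>` pattern of the README's
    only example ("step_200"); the real naming may differ for other steps.
    """
    if step <= 0 or step % SAVE_INTERVAL != 0:
        raise ValueError(
            f"step must be a positive multiple of {SAVE_INTERVAL}, got {step}"
        )
    return f"step_{step}"


def available_revisions(repo_id: str = REPO_ID) -> list[str]:
    """Return the branch and tag names of the Hub repo (needs network access)."""
    # Imported lazily so the pure helper above works without huggingface_hub.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return sorted(ref.name for ref in [*refs.branches, *refs.tags])


def load_checkpoint(step: int, repo_id: str = REPO_ID):
    """Load the model weights saved at `step`.

    Note: this downloads a 32B-parameter model, so it needs substantial
    disk space and memory.
    """
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo_id, revision=checkpoint_revision(step)
    )
```

For instance, `load_checkpoint(200)` issues the same `from_pretrained` call as the README example, provided a `step_200` revision exists in the repository.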