Tags: Text Classification · Transformers · PyTorch · English · llama · text-generation-inference

Model Card for Llama-3.1-Tulu-3-8B-SFT-RM-RB2

Llama-3.1-Tulu-3-8B-SFT-RM-RB2 is one of 7 sets of reward models (RMs) released with RewardBench 2. In total, we released 70 reward model checkpoints, which we used to develop the benchmark and to correlate it with downstream PPO and Best-of-N performance.

Models | Code | Eval. Dataset v2 | Results v2 | Paper

Model Details

The model is a standard classifier (AutoModelForSequenceClassification in the HuggingFace ecosystem) trained on binary preference data. For each model in this batch, the main revision is the best model we obtained for that base model; all other training data and hyperparameter combinations are included as revisions for further research.
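
Binary preference data of this kind is typically used with a pairwise Bradley–Terry objective: the classifier head produces a scalar score per sequence, and the loss maximizes the log-sigmoid of the margin between the chosen and rejected completion. A minimal sketch of that objective (illustrative only, not the exact training code; tensor names are placeholders):

import torch
import torch.nn.functional as F

def pairwise_preference_loss(chosen_scores: torch.Tensor, rejected_scores: torch.Tensor) -> torch.Tensor:
    # chosen_scores / rejected_scores: scalar RM logits for each (chosen, rejected) pair.
    # The loss pushes the chosen score above the rejected score (Bradley–Terry).
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()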

To load a model from a revision, modify the following:

from transformers import AutoModelForSequenceClassification
# Use from_pretrained and pass the desired revision (e.g. "2" from the table below).
rm = AutoModelForSequenceClassification.from_pretrained("allenai/Llama-3.1-Tulu-3-8B-SFT-RM-RB2", revision="2")
| Revision | Training Data | Learning Rate | Num Epochs | RewardBench 2 Score | Factuality | Precise IF | Math | Safety | Focus | Ties |
|---|---|---|---|---|---|---|---|---|---|---|
| main | Combined | 3e-6 | 2 | 68.2 | 73.3 | 38.8 | 57.9 | 89.8 | 88.9 | 60.6 |
| 1 | Combined | 3e-6 | 1 | 67.9 | 75.6 | 40.6 | 62.8 | 83.1 | 80.6 | 64.8 |
| 2 | Combined | 3e-6 | 1 | 67.7 | 74.3 | 40.0 | 61.2 | 84.2 | 80.4 | 66.3 |
| 3 | Skywork | 3e-6 | 2 | 66.7 | 62.9 | 37.5 | 60.7 | 88.0 | 93.7 | 57.5 |
| 4 | Combined | 3e-6 | 1 | 66.1 | 72.0 | 35.6 | 63.9 | 84.4 | 76.4 | 64.3 |
| 5 | Skywork | 3e-6 | 3 | 66.1 | 65.9 | 40.0 | 60.7 | 90.9 | 88.7 | 50.3 |
| 6 | Skywork | 3e-6 | 1 | 65.6 | 62.9 | 41.9 | 61.2 | 91.1 | 82.6 | 53.7 |
| 7 | Tulu | 3e-6 | 2 | 63.5 | 74.3 | 35.6 | 62.3 | 81.1 | 71.3 | 56.1 |
| 8 | Tulu | 3e-6 | 3 | 61.9 | 67.8 | 35.6 | 60.1 | 80.2 | 69.7 | 58.2 |
| 9 | Tulu | 3e-6 | 1 | 61.2 | 73.7 | 40.0 | 62.3 | 80.4 | 60.2 | 50.7 |
| 10 | Tulu | 1e-6 | 2 | 60.1 | 70.9 | 41.2 | 60.7 | 80.2 | 58.6 | 48.8 |
| 11 | Tulu | 3e-6 | 1 | 60.0 | 70.3 | 37.5 | 62.3 | 78.7 | 59.8 | 51.7 |
| 12 | Tulu | 1e-6 | 3 | 60.0 | 70.3 | 31.9 | 57.9 | 82.2 | 67.3 | 50.2 |
| 13 | Tulu | 3e-6 | 1 | 60.0 | 71.8 | 33.8 | 60.7 | 80.0 | 63.2 | 50.3 |
| 14 | Tulu | 3e-6 | 1 | 59.3 | 72.6 | 35.6 | 62.3 | 78.9 | 58.8 | 47.3 |
| 15 | Tulu | 3e-6 | 1 | 59.1 | 73.5 | 40.0 | 62.8 | 74.0 | 60.4 | 43.9 |
| 16 | Tulu | 1e-6 | 1 | 57.2 | 68.0 | 37.5 | 60.7 | 76.7 | 54.7 | 45.5 |
| 17 | Tulu | 2e-5 | 1 | 55.6 | 65.7 | 35.6 | 59.6 | 75.3 | 57.4 | 40.3 |
| 18 | Tulu | 2e-5 | 2 | 52.9 | 61.7 | 37.5 | 57.4 | 68.4 | 56.6 | 35.8 |
| 19 | Tulu | 2e-5 | 3 | 49.8 | 57.3 | 31.2 | 51.9 | 64.9 | 62.2 | 31.1 |
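
Once a checkpoint is loaded, the reward for a prompt–response pair is the scalar logit produced by the classification head. A minimal scoring sketch, assuming the tokenizer's chat template matches the format the RM was trained on and that the classifier has a single output label, as is typical for RMs (the example messages are placeholders):

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "allenai/Llama-3.1-Tulu-3-8B-SFT-RM-RB2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
rm = AutoModelForSequenceClassification.from_pretrained(model_name, torch_dtype=torch.bfloat16)
rm.eval()

messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
with torch.no_grad():
    # The sequence-classification head emits one logit per sequence; use it as the reward.
    reward = rm(input_ids).logits[0, 0].item()
print(f"reward: {reward:.3f}")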

License

All Llama 3.1 Tülu 3 models are released under Meta's Llama 3.1 Community License Agreement. Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. Tülu 3 is intended for research and educational use. For more information, please see our Responsible Use Guidelines.

The models have been fine-tuned using a dataset mix with outputs generated from third party models and are subject to additional terms: Gemma Terms of Use and Qwen License Agreement (models were improved using Qwen 2.5).

Citation

@misc{malik2025rewardbench2advancingreward,
      title={RewardBench 2: Advancing Reward Model Evaluation}, 
      author={Saumya Malik and Valentina Pyatkin and Sander Land and Jacob Morrison and Noah A. Smith and Hannaneh Hajishirzi and Nathan Lambert},
      year={2025},
      eprint={2506.01937},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.01937}, 
}

Model card contact: saumyam at allenai dot org
