File size: 4,732 Bytes
1c96e30 e8a8802 68ea409 e8a8802 1c96e30 670d183 1c96e30 670d183 a2527ee 07ef68c 1c96e30 a2527ee a6a2b5d 7206709 1c96e30 b761124 1013efd bb097d9 1c96e30 a2527ee 1c96e30 a2527ee 670d183 1c96e30 a2527ee 1c96e30 a2527ee 1c96e30 a2527ee 1c96e30 a2527ee 1c96e30 a2527ee 6779f9a a2527ee 1c96e30 a2527ee |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 |
---
license: llama3.1
language:
- en
pipeline_tag: text-classification
datasets:
- allenai/llama-3.1-tulu-3-8b-preference-mixture
- Skywork/Skywork-Reward-Preference-80K-v0.2
base_model:
- meta-llama/Llama-3.1-8B-Instruct
library_name: transformers
---
# Model Card for Llama-3.1-8B-Instruct-RM-RB2
<!-- Provide a quick summary of what the model is/does. -->
Llama-3.1-8B-Instruct-RM-RB2 is one of 7 sets of reward models (RMs) released with Reward Bench 2.
We have released a large set of 70 total reward model checkpoints that we used to develop the benchmark and correlate it with downstream PPO / Best-of-N performance.
[Models](https://huggingface.co/collections/allenai/reward-bench-2-683d2612a4b3e38a3e53bb51) | [Code](https://github.com/allenai/reward-bench) | [Eval. Dataset v2](https://huggingface.co/datasets/allenai/reward-bench-2) | [Results v2](https://huggingface.co/datasets/allenai/reward-bench-2-results) | [Paper](https://arxiv.org/abs/2506.01937)
## Model Details
The model is a standard classifier, `AutoModelForSequenceClassification` within the HuggingFace ecosystem, trained on binary preference data.
For each model in this batch the main revision is the best model we obtained for that base model, and we include all other training data and hyperparameter combinations in the revisions for further research.
To load a model from a revision, modify the following:
```python
from transformers import AutoModelForSequenceClassification
rm = AutoModelForSequenceClassification("allenai/Llama-3.1-8B-Instruct-RM-RB2", revision="2")
```
| Revision | Training Data | Learning Rate | Num Epochs | RewardBench 2 Score | Factuality | Precise IF | Math | Safety | Focus | Ties |
|----------|---------------|---------------|------------|---------------------|------------|------------|------|--------|-------|------|
| main | Combined | 3e-6 | 1 | 72.8 | 74.3 | 44.4 | 61.7 | 89.6 | 90.7 | 76.4 |
| 1 | Combined | 4e-6 | 1 | 72.7 | 73.5 | 43.1 | 63.4 | 89.3 | 89.7 | 77.0 |
| 2 | Combined | 1e-6 | 2 | 72.4 | 73.1 | 40.0 | 66.7 | 94.2 | 94.1 | 66.4 |
| 3 | Combined | 3e-6 | 2 | 72.1 | 71.2 | 38.8 | 66.1 | 90.7 | 91.7 | 74.1 |
| 4 | Combined | 2e-6 | 1 | 71.9 | 72.6 | 38.8 | 63.9 | 89.6 | 92.7 | 73.8 |
| 5 | Combined | 3e-6 | 1 | 71.9 | 73.1 | 39.4 | 60.7 | 89.8 | 93.7 | 74.7 |
| 6 | Combined | 3e-6 | 1 | 71.7 | 72.4 | 43.1 | 61.7 | 87.8 | 89.7 | 75.6 |
| 7 | Skywork | 3e-6 | 1 | 70.5 | 62.5 | 38.1 | 66.7 | 92.0 | 92.3 | 71.1 |
| 8 | Combined | 1e-6 | 1 | 70.4 | 69.5 | 39.4 | 65.6 | 88.7 | 85.9 | 73.3 |
| 9 | Tulu | 3e-6 | 1 | 69.4 | 75.4 | 45.0 | 63.9 | 86.7 | 76.2 | 69.1 |
| 10 | Tulu | 3e-6 | 2 | 68.1 | 71.4 | 44.4 | 62.8 | 86.4 | 76.0 | 67.8 |
| 11 | Tulu | 1e-6 | 2 | 67.5 | 67.2 | 40.0 | 63.4 | 87.6 | 77.4 | 69.8 |
| 12 | Tulu | 3e-6 | 1 | 67.5 | 72.4 | 40.6 | 62.8 | 84.2 | 75.4 | 69.8 |
| 13 | Combined | 2e-5 | 1 | 67.2 | 66.3 | 36.9 | 62.8 | 82.0 | 83.0 | 71.9 |
| 14 | Skywork | 1e-6 | 1 | 66.6 | 59.8 | 36.9 | 63.4 | 89.6 | 86.1 | 64.2 |
| 15 | Tulu | 1e-6 | 1 | 66.4 | 69.5 | 40.6 | 62.8 | 84.2 | 72.7 | 68.3 |
| 16 | Tulu | 3e-6 | 1 | 65.7 | 73.1 | 36.9 | 62.8 | 82.9 | 70.3 | 68.4 |
| 17 | Combined | 2e-5 | 2 | 62.1 | 63.6 | 37.5 | 59.0 | 82.7 | 80.2 | 49.5 |
- **Developed by:** Allen Institute for AI
- **Training code:** https://github.com/allenai/open-instruct
- **Language(s) (NLP):** en
- **License:** Llama 3.1 Community License Agreement
- **Finetuned from model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
## License
All Llama 3.1 Tülu3 models are released under Meta's [Llama 3.1 Community License Agreement](https://www.llama.com/llama3_1/license/).
Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc.
Tülu3 is intended for research and educational use.
For more information, please see our [Responsible Use Guidelines](https://allenai.org/responsible-use).
The models have been fine-tuned using a dataset mix with outputs generated from third party models and are subject to additional terms:
[Gemma Terms of Use](https://ai.google.dev/gemma/terms) and [Qwen License Agreement](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE) (models were improved using Qwen 2.5).
## Citation
```
@misc{malik2025rewardbench2advancingreward,
title={RewardBench 2: Advancing Reward Model Evaluation},
author={Saumya Malik and Valentina Pyatkin and Sander Land and Jacob Morrison and Noah A. Smith and Hannaneh Hajishirzi and Nathan Lambert},
year={2025},
eprint={2506.01937},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.01937},
}
```
Model card contact: `saumyam at allenai dot org` |