Text Classification
Transformers
PyTorch
English
llama
text-generation-inference
saumyamalik committed on
Commit 664f2a0 (verified) · 1 Parent(s): c3bf353

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -15,7 +15,7 @@ library_name: transformers

  <!-- Provide a quick summary of what the model is/does. -->

- {{MODEL_NAME_HERE}} is one of 6 sets of reward models (RMs) released with Reward Bench 2.
+ {{MODEL_NAME_HERE}} is one of 7 sets of reward models (RMs) released with Reward Bench 2.
  We have released a large set of 70 total reward model checkpoints that we used to develop the benchmark and correlate it with downstream PPO / Best-of-N performance.

  [Models](https://huggingface.co/collections/allenai/reward-bench-2-683d2612a4b3e38a3e53bb51) | [Code](https://github.com/allenai/reward-bench) | [Eval. Dataset v2](https://huggingface.co/datasets/allenai/reward-bench-2) | [Results v2](https://huggingface.co/datasets/allenai/reward-bench-2-results) | [Paper](https://github.com/allenai/reward-bench/blob/main/paper-v2.pdf)
@@ -24,7 +24,7 @@ We have released a large set of 70 total reward model checkpoints that we used t
  ## Model Details

  The model is a standard classifier, `AutoModelForSequenceClassification` within the HuggingFace ecosystem, trained on binary preference data.
- For each model in this batch the main revision is the best model we obtained for that base model, and we include all other training data and hyperparamter combinations in the revisions for further research.
+ For each model in this batch the main revision is the best model we obtained for that base model, and we include all other training data and hyperparameter combinations in the revisions for further research.

  To load a model from a revision, modify the following: