Commit 62f5e51 (verified) by saumyamalik · 1 Parent(s): 664f2a0

Update README.md

Files changed (1): README.md (+3 -3)
@@ -11,11 +11,11 @@ base_model:
  library_name: transformers
  ---
 
- # Model Card for {{MODEL_NAME_HERE}}
+ # Model Card for Llama-3.1-70B-Instruct-RM-RB2
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- {{MODEL_NAME_HERE}} is one of 7 sets of reward models (RMs) released with Reward Bench 2.
+ Llama-3.1-70B-Instruct-RM-RB2 is one of 7 sets of reward models (RMs) released with Reward Bench 2.
  We have released a large set of 70 total reward model checkpoints that we used to develop the benchmark and correlate it with downstream PPO / Best-of-N performance.
 
  [Models](https://huggingface.co/collections/allenai/reward-bench-2-683d2612a4b3e38a3e53bb51) | [Code](https://github.com/allenai/reward-bench) | [Eval. Dataset v2](https://huggingface.co/datasets/allenai/reward-bench-2) | [Results v2](https://huggingface.co/datasets/allenai/reward-bench-2-results) | [Paper](https://github.com/allenai/reward-bench/blob/main/paper-v2.pdf)
@@ -44,7 +44,7 @@ rm = AutoModelForSequenceClassification("allenai/Llama-3.1-70B-Instruct-RM-RB2",
  - **Training code:** https://github.com/allenai/open-instruct
  - **Language(s) (NLP):** en
  - **License:** Llama 3.1 Community License Agreement
- - **Finetuned from model [optional]:** {{TODO_BASE_MODEL_HERE}}
+ - **Finetuned from model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct)
 
  ## License
 
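
The model card summary in this diff mentions Best-of-N as one downstream use of the reward models. As a rough illustration only, Best-of-N selection can be sketched as picking the candidate response that a scoring function rates highest. The names `best_of_n` and `toy_score` below are hypothetical and are not part of the README or of any library; `toy_score` is a stand-in that prefers longer responses and does not call the actual reward model.

```python
from typing import Callable, Sequence


def best_of_n(
    prompt: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate that the scoring function rates highest for the prompt."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=lambda response: score(prompt, response))


def toy_score(prompt: str, response: str) -> float:
    # Hypothetical stand-in for a reward model: longer responses score higher.
    return float(len(response))


best = best_of_n("What is 2 + 2?", ["4", "2 + 2 = 4", "five"], toy_score)
```

In practice `score` would wrap a forward pass of a reward model checkpoint and return its scalar output; that wiring is omitted here because the diff does not show it.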