Update README.md
README.md
CHANGED
@@ -10,27 +10,27 @@ datasets:
 metrics:
 - accuracy
 tags:
-- pair-ranker
-- pair_ranker
 - reward_model
 - reward-model
-- pairrm
-- pair-rm
 - RLHF
+- evaluation
+- llm
+- instruction
+- reranking
 language:
 - en
+pipeline_tag: text-generation
 ---
 
-Inspired by [DeBERTa Reward Model Series](https://huggingface.co/OpenAssistant/reward-model-deberta-v3-large-v2)
-`llm-blender/PairRM` is pairranker version finetuned specifically as a reward model using deberta-v3-large.
 
 - Github: [https://github.com/yuchenlin/LLM-Blender](https://github.com/yuchenlin/LLM-Blender)
 - Paper: [https://arxiv.org/abs/2306.02561](https://arxiv.org/abs/2306.02561)
 - Space Demo: [https://huggingface.co/spaces/llm-blender/LLM-Blender](https://huggingface.co/spaces/llm-blender/LLM-Blender)
 
-##
+## Introduction
 
-
+
+## Installation
 Since PairRanker contains some custom layers and tokens. We recommend use PairRM with our llm-blender code API.
 - First install `llm-blender`
 ```bash
@@ -44,6 +44,9 @@ blender = llm_blender.Blender()
 blender.loadranker("llm-blender/PairRM") # load PairRM
 ```
 
+
+## Usage
+
 ### Use case 1: Compare responses (Quality Evaluator)
 
 - Then you can rank candidate responses with the following function
@@ -198,7 +201,9 @@ Two reasons to attribute:
 
 
 
-
+
+
+## Citation & Credits
 If you are using PairRM in your research, please cite LLM-blender.
 ```bibtex
 @inproceedings{llm-blender-2023,
@@ -209,3 +214,5 @@ If you are using PairRM in your research, please cite LLM-blender.
 }
 
 ```
+
+
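The front-matter hunk is the only part of this change that alters metadata rather than headings or whitespace, so it may help to see the tag change summarized. The following is a minimal sketch, not part of the commit: the two lists are copied by hand from the hunk above, and `tag_changes` is a hypothetical helper name introduced only for illustration.

```python
# Tag lists transcribed from the first hunk of the diff (old side, then new side).
old_tags = [
    "pair-ranker", "pair_ranker", "reward_model", "reward-model",
    "pairrm", "pair-rm", "RLHF",
]
new_tags = [
    "reward_model", "reward-model", "RLHF",
    "evaluation", "llm", "instruction", "reranking",
]


def tag_changes(old, new):
    """Return (removed, added, kept) lists, each preserving source order.

    Hypothetical helper for illustration; it is not part of llm-blender.
    """
    old_set, new_set = set(old), set(new)
    removed = [t for t in old if t not in new_set]
    added = [t for t in new if t not in old_set]
    kept = [t for t in old if t in new_set]
    return removed, added, kept


removed, added, kept = tag_changes(old_tags, new_tags)
print("removed:", removed)
print("added:", added)
print("kept:", kept)
```

Under these assumptions, the summary shows that the four PairRanker-specific tag spellings are dropped, four broader discovery tags are added, and the three reward-model tags are unchanged. The new `pipeline_tag: text-generation` key is a separate scalar field and is not covered by this list comparison.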