Text Generation
Transformers
Safetensors
English
reward model
conversational
cguna committed on
Commit 192e031 · verified · 1 Parent(s): a1cdbc3

Upload 11 files

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ Benchmark_PRM_perf.png filter=lfs diff=lfs merge=lfs -text
37
+ PRM_BoN_rows.png filter=lfs diff=lfs merge=lfs -text
Benchmark_PRM_perf.png ADDED

Git LFS Details

  • SHA256: fe3a4dfb3f8a28b76b16ab110154b4af55facdcb7db9a226ae088692a372edfc
  • Pointer size: 131 Bytes
  • Size of remote file: 485 kB
PRM_BoN_rows.png ADDED

Git LFS Details

  • SHA256: 05d407b6a171b8329cf1843684e6aa560541ac879e7040234c061a7bfa1f9d4e
  • Pointer size: 131 Bytes
  • Size of remote file: 557 kB
README.md CHANGED
@@ -1,3 +1,209 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ pipeline_tag: text-generation
6
+ library_name: transformers
7
+ tags:
8
+ - reward model
9
+ base_model:
10
+ - ibm-granite/granite-3.3-8b-instruct
11
+ ---
12
+ # Granite-3.3-8B-LoRA-Math-PRM
13
+
14
+ **Model Summary**
15
+
16
+ Granite 3.3 8B LoRA Math PRM is a LoRA adapter for the 8-billion-parameter language model [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct), built for use as a generative process reward model (PRM) for process supervision in mathematical reasoning. Crucially, this model has only been trained on curated data from sources with permissive licenses, and we release this model under an Apache 2.0 license.
17
+
18
+ This model can be used to assess the correctness of each step of a mathematical reasoning process, and it shows strong performance on Best-of-N evaluations for a variety of generators on Math-500, as well as strong error-identification performance on both [ProcessBench](https://arxiv.org/abs/2412.06559) and [PRMBench](https://arxiv.org/abs/2501.03124).
19
+
20
+ - Developers: Granite Alignment Team, IBM Research
21
+ - Release Date: June 24th, 2025
22
+ - License: Apache 2.0
23
+
24
+ **Supported Languages**
25
+
26
+ This adapter has been finetuned specifically for English; however, the base model supports English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.
27
+
28
+ **Intended Use**
29
+
30
+ Granite 3.3 8B LoRA Math PRM is a LoRA adapter for [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) that gives the language model the ability to perform process supervision on mathematical reasoning by assessing the correctness of each step of a reasoning chain. At inference time, the model takes a question and a response broken down into generated steps, and for each step it determines whether the reasoning chain so far is correct (indicated by generating a single token, `Y`) or incorrect (indicated by generating `N`). The probability of generating `Y` can be treated as a numeric reward score in applications such as Best-of-N evaluation.
31
+
32
+ To obtain each judgment, the model expects the user-provided prompt `"Is this response correct so far (Y/N)?"`, which should be appended to the end of every step of the reasoning chain.
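+
+ For illustration, a scored two-step chain is serialized by the chat template as alternating user/assistant turns (the system turn is omitted here, and `{question}` and `{step i}` are placeholders; this is the structure built by `prepare_input` in the Usage section below):
+
+ ```
+ <|start_of_role|>user<|end_of_role|>{question} {step 1} Is this response correct so far (Y/N)?<|end_of_text|>
+ <|start_of_role|>assistant<|end_of_role|>Y<|end_of_text|>
+ <|start_of_role|>user<|end_of_role|>{step 2} Is this response correct so far (Y/N)?<|end_of_text|>
+ <|start_of_role|>assistant<|end_of_role|>Y<|end_of_text|>
+ ```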
33
+
34
+
35
+
36
+ **Evaluation Results**
37
+
38
+ **a. Best-of-N Evaluation on Math-500**
39
+
40
+
41
+ We show performance on MATH-500 with inference-time scaling for a variety of LLM generators, including [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct), [Phi-4](https://huggingface.co/microsoft/phi-4), and [Qwen-2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct), and observe strong gains over Majority Voting with both Best-of-N and Weighted Majority Voting using Granite-3.3-8B-LoRA-Math-PRM.
42
+
43
+
44
+ <img src="PRM_BoN_rows.png" alt="PRM Performance on Math-500" width="5000"/>
45
+
46
+
47
+ We also compare the Best-of-N performance on Math-500 of available PRMs on [Qwen-2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct) generations, and show the strong performance of Granite-3.3-8B-LoRA-Math-PRM over majority voting (a minimal selection sketch follows the table):
48
+
49
+ | Method / N sampled responses | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 |
50
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- |
51
+ | Majority Voting |75.8 | 81.6 | 84.6 | 85.2 | 85.4 | 85.6 | 86.0 | 85.6 |
52
+ | **Granite-3.3-8B-LoRA-Math-PRM**| 81.6 | 84.2 | 84.8 | 86.2 | 86.8 | 87.2 | 88.0 | 87.2 |
53
+ | [Qwen2.5-Math-PRM-7B](https://huggingface.co/Qwen/Qwen2.5-Math-PRM-7B)| 82.0 | 84.8 | 86.6 | 87.0 | 88.2 | 89.0 | 88.8 | 89.0|
54
+ | [MathShepherd-Mistral-7B PRM 7B](https://huggingface.co/peiyi9979/math-shepherd-mistral-7b-prm)| 80.8 | 83.0 | 83.8 | 84.8 | 86.2 | 85.2 | 86.0 | 85.2 |
55
+ | [RLHFLow Llama3.1-8B-PRM-Deepseek-Data](https://huggingface.co/RLHFlow/Llama3.1-8B-PRM-Deepseek-Data)| 80.6 | 82.4 | 83.6 | 85.2 | 85.8 | 85.8 | 85.0 | 84.6 |
56
+
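+ For reference, below is a minimal sketch (not part of the released code) of how per-step PRM rewards, computed as in the Usage section further down, can drive Best-of-N selection and Weighted Majority Voting. Collapsing a response's step rewards into a single score via the minimum step reward is an assumption of this sketch, one common convention, rather than a detail specified by this card.
+
+ ```python
+ from collections import defaultdict
+ from typing import List, Tuple
+
+ def score_response(step_rewards: List[float]) -> float:
+     # Collapse per-step P("Y") rewards into one response-level score.
+     # Taking the minimum step reward is an assumed (common) convention.
+     return min(step_rewards)
+
+ def best_of_n(candidates: List[Tuple[str, List[float]]]) -> str:
+     # candidates: (final_answer, step_rewards) for each of the N sampled responses.
+     # Best-of-N: return the answer of the highest-scoring response.
+     return max(candidates, key=lambda c: score_response(c[1]))[0]
+
+ def weighted_majority_vote(candidates: List[Tuple[str, List[float]]]) -> str:
+     # Weighted Majority Voting: accumulate response scores per distinct final
+     # answer and return the answer with the largest total weight.
+     weights = defaultdict(float)
+     for answer, step_rewards in candidates:
+         weights[answer] += score_response(step_rewards)
+     return max(weights, key=weights.get)
+ ```
+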
57
+
58
+ **b. ProcessBench**
59
+
60
+
61
+ <img src="Benchmark_PRM_perf.png" alt="PRM Performance on ProcessBench and PRMBench" height="400"/>
62
+
63
+ As shown above, Granite-3.3-8B-LoRA-Math-PRM performs strongly on both ProcessBench (top 3) and PRMBench (top 2) compared to other models of the same parameter class, indicating a strong ability to detect errors in reasoning tasks.
64
+
65
+ **c. PRMBench: Detailed Results**
+ (Column abbreviations are PRMBench's fine-grained error categories, grouped under simplicity, soundness, and sensitivity; see the [PRMBench](https://arxiv.org/abs/2501.03124) paper for definitions.)
66
+
67
+
68
+ | Model | Overall| NR. | NCL. | Avg (simplicity) | ES. | SC. | DC. | CI. | Avg (soundness) | PS. | DR. | MS. | Avg (sensitivity) |
69
+ |-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|------- |
70
+ | **Granite-3.3-8B-LoRA-Math-PRM** | 64.5 | 50.9 | 61.5 | 56.2 | 69.1 | 66.7 | 64.7 | 70.5 | 67.8 | 59.9 | 65.9 | 98.1 | 74.7
71
+ | [Qwen2.5-Math-PRM-7B](https://huggingface.co/Qwen/Qwen2.5-Math-PRM-7B) | 65.5 | 49.0 | 55.1 | 52.1 | 71.8 | 67.3 | 66.3 | 78.5 | 71.0 | 57.6 | 69.1 | 99.7 | 75.5
72
+ | [Skywork-PRM-7B](https://huggingface.co/Skywork/Skywork-o1-Open-PRM-Qwen-2.5-7B) | 65.1 | 56.4 | 62.8 | 59.6 | 69.4 | 67.1 | 67.7 | 69.9 | 68.5 | 60.9 | 65.8 | 93.2 | 73.7
73
+ | [Skywork-PRM-1.5B](https://huggingface.co/Skywork/Skywork-o1-Open-PRM-Qwen-2.5-1.5B) | 61.1 | 52 | 56.4 | 54.2 | 64.8 | 64.9 | 63.3 | 66.5 | 64.9 | 57.5 | 63.3 | 91.1 | 70.7
74
+ | [ReasonEval-34B](https://huggingface.co/GAIR/ReasonEval-34B) | 60.5 | 54.8| 48.1 | 51.5 | 66.4 | 60.3 | 57.8 | 67.5 | 63.0 | 57.7 | 64.3 | 97.2 | 73.1
75
+ | [ReasonEval-7B](https://huggingface.co/GAIR/ReasonEval-7B) | 60.1 | 61.0 | 50.1 | 55.6 | 62.1 | 65.9 | 61.5 | 66.0 | 63.9 | 55.7 | 58.0 | 99.5 | 71.1
76
+ | [RLHFlow-PRM-Mistral-8B](https://huggingface.co/RLHFlow/Llama3.1-8B-PRM-Mistral-Data) | 54.4 | 46.1 | 47.3 | 46.7 | 56.6 | 55.1 | 54.4 | 63.8 | 57.5 | 51.5 | 56.2 | 97.9 | 68.5
77
+ | [RLHFlow-PRM-Deepseek-8B](https://huggingface.co/RLHFlow/Llama3.1-8B-PRM-Deepseek-Data) | 54.2 | 46.4 | 48.9 | 47.6 | 55.7 | 55.0 | 53.2 | 66.2 | 57.5 | 49.0 | 55.4 | 99.8 | 68.1
78
+ | [MathShepherd-Mistral-7B](https://huggingface.co/peiyi9979/math-shepherd-mistral-7b-prm) | 47.0 | 44.0 | 50.3 | 47.1 | 49.4 | 44.5 | 41.3 | 47.7 | 45.7 | 47.2 | 48.6 | 86.1 | 60.7
79
+
80
+ **Training Data**
81
+
82
+ For training the Math PRM adapter, we curate training data from a diverse set of model responses to prompts from math-specific datasets, specifically [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), [MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), and [NuminaMath](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT). We leverage a range of LLMs from the Granite language model family, Phi-4, and Mixtral 8x22B to generate outputs, and use the Automatic Process Supervision method described in [Luo et al., 2024](https://arxiv.org/abs/2406.06592) for detecting steps with errors.
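+
+ For intuition, automatic process supervision labels a step by checking, with Monte Carlo rollouts, whether the solution prefix ending at that step can still reach the correct final answer, and uses a binary search over prefixes to locate the first erroneous step with few rollouts. The sketch below only illustrates that idea under stated assumptions; `sample_completions` and `reaches_correct_answer` are hypothetical helpers, and the actual procedure follows [Luo et al., 2024](https://arxiv.org/abs/2406.06592).
+
+ ```python
+ from typing import Callable, List
+
+ def prefix_value(question: str, steps: List[str], k: int,
+                  sample_completions: Callable[[str, List[str]], List[str]],
+                  reaches_correct_answer: Callable[[str], bool]) -> float:
+     # Monte Carlo estimate: fraction of sampled completions of the first k steps
+     # that still reach the correct final answer.
+     completions = sample_completions(question, steps[:k])
+     return sum(reaches_correct_answer(c) for c in completions) / max(len(completions), 1)
+
+ def first_error_step(question: str, steps: List[str],
+                      sample_completions: Callable[[str, List[str]], List[str]],
+                      reaches_correct_answer: Callable[[str], bool]) -> int:
+     # Binary search for the earliest prefix whose Monte Carlo value hits zero,
+     # i.e. the first step after which the correct answer is no longer reachable.
+     lo, hi = 1, len(steps)
+     while lo < hi:
+         mid = (lo + hi) // 2
+         if prefix_value(question, steps, mid, sample_completions, reaches_correct_answer) > 0:
+             lo = mid + 1
+         else:
+             hi = mid
+     if prefix_value(question, steps, lo, sample_completions, reaches_correct_answer) > 0:
+         return -1  # no erroneous step detected
+     return lo      # 1-based index of the first erroneous step
+ ```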
83
+
84
+ **Usage**
85
+
86
+
87
+ Sample usage for obtaining PRM scores for a given response using Hugging Face Transformers:
88
+
89
+ ```python
90
+ import torch
91
+ from transformers import AutoModelForCausalLM, AutoTokenizer
92
+ from typing import List
93
+
94
+ def prepare_input(query: str, steps: List[str], tokenizer: AutoTokenizer, correct_token: str, generation_prompt: str):
95
+ messages = []
96
+
97
+ for s_idx, step in enumerate(steps):
98
+ if s_idx == 0:
99
+ # append query and first step
100
+ message = {'role': 'user', 'content': query + " " + step + " " + generation_prompt}
101
+ else:
102
+ message = {'role': 'user', 'content': step + " " + generation_prompt}
103
+
104
+ messages.append(message)
105
+ messages.append({'role': 'assistant', 'content': correct_token})
106
+
107
+ input_message = tokenizer.apply_chat_template(messages, add_generation_prompt = False, tokenize = False)
108
+
109
+ return input_message
110
+
111
+ def get_step_ids(input_ids, tokenizer, correct_token, correct_token_id):
112
+ # get assistant turn indices
113
+ asst_text = "<|start_of_role|>assistant<|end_of_role|>" + correct_token + "<|end_of_text|>"
114
+ asst_toks = tokenizer(asst_text, add_special_tokens = False, return_tensors = "pt")['input_ids'][0]
115
+ asst_toks_before_correct_token = asst_toks[:torch.where(asst_toks == correct_token_id)[0].item()].tolist()
116
+
117
+ input_ids = input_ids[0]
118
+ # find batch index for assistant turn "Y", not just the correct_token_id
119
+ correct_token_indices = torch.where(input_ids == correct_token_id)[0].tolist()
120
+ prm_indices = []
121
+ for t_idx in correct_token_indices:
122
+ if input_ids[t_idx - len(asst_toks_before_correct_token) :t_idx].tolist() == asst_toks_before_correct_token:
123
+ prm_indices.append(t_idx-1) # the logits for token i predict the token i+1: so, we need to look at the PREVIOUS token logits
124
+
125
+ assert len(prm_indices)>0
126
+ return prm_indices
127
+
128
+ model_name_or_path = "ibm-granite/granite-3.3-8b-lora-math-prm"
129
+ model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map = "auto")
130
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
131
+
132
+ correct_token = "Y"
133
+ correct_token_id = tokenizer.encode(correct_token, add_special_tokens=False)[0]
134
+ generation_prompt = "Is this response correct so far (Y/N)?"
135
+
136
+
137
+ data = {
138
+ "query": "For breakfast, Anna bought a bagel for $x and a glass of orange juice for $0.85. At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. How much more money did Anna spend on lunch than on breakfast? If we know the answer to the above question is 4, what is the value of unknown variable x?",
139
+ "response":[
140
+ "At breakfast, Anna spent x dollars on a bagel and $0.85 on a glass of orange juice. The total cost of breakfast is x + $0.85.",
141
+ "At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. The total cost of lunch is $4.65 + $1.15 = $5.80.",
142
+ "To find out how much more money Anna spent on lunch than on breakfast, we subtract the cost of breakfast from the cost of lunch: $5.80 - (x + $0.85).",
143
+ "We are given that the difference is $4, so we can write: $5.80 - (x + $0.85) = $4.",
144
+ "Simplifying the left side, we get: $5.80 - x - $0.85 = $4.",
145
+ "Adding -$0.85 to both sides, we get: $5.80 -x = $3.15.",
146
+ "Subtracting $5.80 from both sides, we get: -x = -$2.65.",
147
+ "Dividing both sides by -1, we get: x = $2.65."
148
+ ]
149
+ }
150
+
151
+
152
+ formatted_data = prepare_input(query=data['query'], steps=data['response'], tokenizer=tokenizer, correct_token=correct_token, generation_prompt=generation_prompt)
153
+ input_ids = tokenizer.encode(formatted_data, return_tensors="pt").to(model.device)
154
+
155
+ with torch.no_grad():
156
+ logits = model(input_ids=input_ids).logits
157
+
158
+ # get step positions
159
+ prm_indices = get_step_ids(input_ids, tokenizer, correct_token, correct_token_id)
160
+
161
+ # get corresponding rewards: convert logits to probabilities and get the probability of the correct token id as reward
162
+ softmax = torch.nn.Softmax(dim=-1)
163
+ step_rewards = []
164
+ for prm_idx in prm_indices:
165
+ step_rewards.append(softmax(logits[0, prm_idx, :])[correct_token_id].item())
166
+
167
+ print(step_rewards)
168
+ # # [0.9998785257339478, 0.9996663331985474, 0.9991942048072815, 0.9993413090705872, 0.9996351003646851, 0.519490122795105, 0.9416136145591736, 0.9942548871040344]
169
+ ```
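+
+ Note how the printed rewards dip at the sixth step (≈0.52), which is where the algebra in this worked example first goes wrong, while the surrounding steps score close to 1.0.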
170
+
171
+ To use the PRM as a verbalizer of correctness for a partial response up to a given step:
172
+ ```python
173
+ import torch
174
+ from transformers import AutoModelForCausalLM, AutoTokenizer
175
+
176
+
177
+ model_name_or_path = "ibm-granite/granite-3.3-8b-lora-math-prm"
178
+ model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map = "auto")
179
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
180
+ generation_prompt = "Is this response correct so far (Y/N)?"
181
+
182
+
183
+ data = {
184
+ "query": "For breakfast, Anna bought a bagel for $x and a glass of orange juice for $0.85. At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. How much more money did Anna spend on lunch than on breakfast? If we know the answer to the above question is 4, what is the value of unknown variable x?",
185
+ "partial_response":
186
+ "At breakfast, Anna spent x dollars on a bagel and $0.85 on a glass of orange juice. The total cost of breakfast is x + $0.85. At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. The total cost of lunch is $4.65 + $1.15 = $5.80. To find out how much more money Anna spent on lunch than on breakfast, we subtract the cost of breakfast from the cost of lunch: $5.80 - (x + $0.85).",
187
+ }
188
+
189
+ # format the prompts
190
+ formatted_prompt = tokenizer.apply_chat_template([{'role':'user', 'content': data['query'] + " " + data['partial_response'] + " " + generation_prompt}], add_generation_prompt=True, tokenize=False)
191
+ inputs = tokenizer(formatted_prompt, return_tensors="pt")
192
+
193
+ # generate output
194
+ with torch.no_grad():
195
+ response = model.generate(inputs["input_ids"].to(model.device), attention_mask=inputs["attention_mask"].to(model.device), max_new_tokens=2)
196
+
197
+ output_text = tokenizer.decode(response[0])
198
+ print(output_text)
199
+ # # <|start_of_role|>assistant<|end_of_role|>Y<|end_of_text|>
200
+ ```
201
+
202
+
203
+ **Infrastructure**
204
+
205
+ We train Granite-3.3-8B-LoRA-Math-PRM on IBM's supercomputing cluster, Blue Vela, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable and efficient infrastructure for training our models over multiple GPUs.
206
+
207
+ **Ethical Considerations and Limitations**
208
+
209
+ Granite-3.3-8B-LoRA-Math-PRM is an adapter for Granite-3.3-8B-Instruct. Since it inherits its foundation from the instruct model, all ethical considerations and limitations applicable to [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) remain relevant.
adapter_config.json ADDED
@@ -0,0 +1,39 @@
1
+ {
2
+ "alpha_pattern": {},
3
+ "auto_mapping": null,
4
+ "base_model_name_or_path": "ibm-granite/granite-3.3-8b-instruct",
5
+ "bias": "none",
6
+ "corda_config": null,
7
+ "eva_config": null,
8
+ "exclude_modules": null,
9
+ "fan_in_fan_out": false,
10
+ "inference_mode": true,
11
+ "init_lora_weights": true,
12
+ "layer_replication": null,
13
+ "layers_pattern": null,
14
+ "layers_to_transform": null,
15
+ "loftq_config": {},
16
+ "lora_alpha": 8,
17
+ "lora_bias": false,
18
+ "lora_dropout": 0.1,
19
+ "megatron_config": null,
20
+ "megatron_core": "megatron.core",
21
+ "modules_to_save": null,
22
+ "peft_type": "LORA",
23
+ "r": 8,
24
+ "rank_pattern": {},
25
+ "revision": null,
26
+ "target_modules": [
27
+ "q_proj",
28
+ "k_proj",
29
+ "v_proj",
30
+ "o_proj",
31
+ "up_proj",
32
+ "down_proj",
33
+ "gate_proj"
34
+ ],
35
+ "task_type": "CAUSAL_LM",
36
+ "trainable_token_indices": null,
37
+ "use_dora": false,
38
+ "use_rslora": false
39
+ }
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:01f931c4a8bc84c2a04ab3a0463337147f74086a2c71cd479914b4627d1e8800
3
+ size 49554176
added_tokens.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "<|end_of_cite|>": 49156,
3
+ "<|end_of_plugin|>": 49158,
4
+ "<|end_of_role|>": 49153,
5
+ "<|start_of_cite|>": 49155,
6
+ "<|start_of_plugin|>": 49157,
7
+ "<|start_of_role|>": 49152,
8
+ "<|tool_call|>": 49154
9
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,39 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<|start_of_role|>",
4
+ "<|end_of_role|>",
5
+ "<|tool_call|>",
6
+ "<|start_of_cite|>",
7
+ "<|end_of_cite|>",
8
+ "<|start_of_plugin|>",
9
+ "<|end_of_plugin|>"
10
+ ],
11
+ "bos_token": {
12
+ "content": "<|end_of_text|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false
17
+ },
18
+ "eos_token": {
19
+ "content": "<|end_of_text|>",
20
+ "lstrip": false,
21
+ "normalized": false,
22
+ "rstrip": false,
23
+ "single_word": false
24
+ },
25
+ "pad_token": {
26
+ "content": "<|end_of_text|>",
27
+ "lstrip": false,
28
+ "normalized": false,
29
+ "rstrip": false,
30
+ "single_word": false
31
+ },
32
+ "unk_token": {
33
+ "content": "<|end_of_text|>",
34
+ "lstrip": false,
35
+ "normalized": false,
36
+ "rstrip": false,
37
+ "single_word": false
38
+ }
39
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,235 @@
1
+ {
2
+ "add_bos_token": false,
3
+ "add_prefix_space": false,
4
+ "added_tokens_decoder": {
5
+ "0": {
6
+ "content": "<|end_of_text|>",
7
+ "lstrip": false,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false,
11
+ "special": true
12
+ },
13
+ "1": {
14
+ "content": "<fim_prefix>",
15
+ "lstrip": false,
16
+ "normalized": false,
17
+ "rstrip": false,
18
+ "single_word": false,
19
+ "special": true
20
+ },
21
+ "2": {
22
+ "content": "<fim_middle>",
23
+ "lstrip": false,
24
+ "normalized": false,
25
+ "rstrip": false,
26
+ "single_word": false,
27
+ "special": true
28
+ },
29
+ "3": {
30
+ "content": "<fim_suffix>",
31
+ "lstrip": false,
32
+ "normalized": false,
33
+ "rstrip": false,
34
+ "single_word": false,
35
+ "special": true
36
+ },
37
+ "4": {
38
+ "content": "<fim_pad>",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false,
43
+ "special": true
44
+ },
45
+ "5": {
46
+ "content": "<filename>",
47
+ "lstrip": false,
48
+ "normalized": false,
49
+ "rstrip": false,
50
+ "single_word": false,
51
+ "special": true
52
+ },
53
+ "6": {
54
+ "content": "<gh_stars>",
55
+ "lstrip": false,
56
+ "normalized": false,
57
+ "rstrip": false,
58
+ "single_word": false,
59
+ "special": true
60
+ },
61
+ "7": {
62
+ "content": "<issue_start>",
63
+ "lstrip": false,
64
+ "normalized": false,
65
+ "rstrip": false,
66
+ "single_word": false,
67
+ "special": true
68
+ },
69
+ "8": {
70
+ "content": "<issue_comment>",
71
+ "lstrip": false,
72
+ "normalized": false,
73
+ "rstrip": false,
74
+ "single_word": false,
75
+ "special": true
76
+ },
77
+ "9": {
78
+ "content": "<issue_closed>",
79
+ "lstrip": false,
80
+ "normalized": false,
81
+ "rstrip": false,
82
+ "single_word": false,
83
+ "special": true
84
+ },
85
+ "10": {
86
+ "content": "<jupyter_start>",
87
+ "lstrip": false,
88
+ "normalized": false,
89
+ "rstrip": false,
90
+ "single_word": false,
91
+ "special": true
92
+ },
93
+ "11": {
94
+ "content": "<jupyter_text>",
95
+ "lstrip": false,
96
+ "normalized": false,
97
+ "rstrip": false,
98
+ "single_word": false,
99
+ "special": true
100
+ },
101
+ "12": {
102
+ "content": "<jupyter_code>",
103
+ "lstrip": false,
104
+ "normalized": false,
105
+ "rstrip": false,
106
+ "single_word": false,
107
+ "special": true
108
+ },
109
+ "13": {
110
+ "content": "<jupyter_output>",
111
+ "lstrip": false,
112
+ "normalized": false,
113
+ "rstrip": false,
114
+ "single_word": false,
115
+ "special": true
116
+ },
117
+ "14": {
118
+ "content": "<empty_output>",
119
+ "lstrip": false,
120
+ "normalized": false,
121
+ "rstrip": false,
122
+ "single_word": false,
123
+ "special": true
124
+ },
125
+ "15": {
126
+ "content": "<commit_before>",
127
+ "lstrip": false,
128
+ "normalized": false,
129
+ "rstrip": false,
130
+ "single_word": false,
131
+ "special": true
132
+ },
133
+ "16": {
134
+ "content": "<commit_msg>",
135
+ "lstrip": false,
136
+ "normalized": false,
137
+ "rstrip": false,
138
+ "single_word": false,
139
+ "special": true
140
+ },
141
+ "17": {
142
+ "content": "<commit_after>",
143
+ "lstrip": false,
144
+ "normalized": false,
145
+ "rstrip": false,
146
+ "single_word": false,
147
+ "special": true
148
+ },
149
+ "18": {
150
+ "content": "<reponame>",
151
+ "lstrip": false,
152
+ "normalized": false,
153
+ "rstrip": false,
154
+ "single_word": false,
155
+ "special": true
156
+ },
157
+ "49152": {
158
+ "content": "<|start_of_role|>",
159
+ "lstrip": false,
160
+ "normalized": false,
161
+ "rstrip": false,
162
+ "single_word": false,
163
+ "special": true
164
+ },
165
+ "49153": {
166
+ "content": "<|end_of_role|>",
167
+ "lstrip": false,
168
+ "normalized": false,
169
+ "rstrip": false,
170
+ "single_word": false,
171
+ "special": true
172
+ },
173
+ "49154": {
174
+ "content": "<|tool_call|>",
175
+ "lstrip": false,
176
+ "normalized": false,
177
+ "rstrip": false,
178
+ "single_word": false,
179
+ "special": true
180
+ },
181
+ "49155": {
182
+ "content": "<|start_of_cite|>",
183
+ "lstrip": false,
184
+ "normalized": false,
185
+ "rstrip": false,
186
+ "single_word": false,
187
+ "special": true
188
+ },
189
+ "49156": {
190
+ "content": "<|end_of_cite|>",
191
+ "lstrip": false,
192
+ "normalized": false,
193
+ "rstrip": false,
194
+ "single_word": false,
195
+ "special": true
196
+ },
197
+ "49157": {
198
+ "content": "<|start_of_plugin|>",
199
+ "lstrip": false,
200
+ "normalized": false,
201
+ "rstrip": false,
202
+ "single_word": false,
203
+ "special": true
204
+ },
205
+ "49158": {
206
+ "content": "<|end_of_plugin|>",
207
+ "lstrip": false,
208
+ "normalized": false,
209
+ "rstrip": false,
210
+ "single_word": false,
211
+ "special": true
212
+ }
213
+ },
214
+ "additional_special_tokens": [
215
+ "<|start_of_role|>",
216
+ "<|end_of_role|>",
217
+ "<|tool_call|>",
218
+ "<|start_of_cite|>",
219
+ "<|end_of_cite|>",
220
+ "<|start_of_plugin|>",
221
+ "<|end_of_plugin|>"
222
+ ],
223
+ "bos_token": "<|end_of_text|>",
224
+ "chat_template": "{# Alias tools -> available_tools #}\n{%- if tools and not available_tools -%}\n {%- set available_tools = tools -%}\n{%- endif -%}\n{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] %}\n {%- set loop_messages = messages[1:] %}\n {%- else %}\n {%- set system_message = \"Knowledge Cutoff Date: April 2024.\nToday's Date: \" + strftime_now('%B %d, %Y') + \".\nYou are Granite, developed by IBM.\" %}\n {%- if available_tools and documents %}\n {%- set system_message = system_message + \" You are a helpful assistant with access to the following tools. When a tool is required to answer the user's query, respond only with <|tool_call|> followed by a JSON list of tools used. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.\nWrite the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.\" %}\n {%- elif available_tools %}\n {%- set system_message = system_message + \" You are a helpful assistant with access to the following tools. When a tool is required to answer the user's query, respond only with <|tool_call|> followed by a JSON list of tools used. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.\" %}\n {%- elif documents %}\n {%- set system_message = system_message + \" Write the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.\" %}\n {%- elif thinking %}\n {%- set system_message = system_message + \" You are a helpful AI assistant.\nRespond to every user query in a comprehensive and detailed way. You can write down your thoughts and reasoning process before responding. In the thought process, engage in a comprehensive cycle of analysis, summarization, exploration, reassessment, reflection, backtracing, and iteration to develop well-considered thinking process. In the response section, based on various attempts, explorations, and reflections from the thoughts section, systematically present the final solution that you deem correct. The response should summarize the thought process. Write your thoughts between <think></think> and write your response between <response></response> for each user query.\" %}\n {%- else %}\n {%- set system_message = system_message + \" You are a helpful AI assistant.\" %}\n {%- endif %}\n {%- if 'citations' in controls and documents %}\n {%- set system_message = system_message + '\nUse the symbols <|start_of_cite|> and <|end_of_cite|> to indicate when a fact comes from a document in the search result, e.g <|start_of_cite|> {document_id: 1}my fact <|end_of_cite|> for a fact from document 1. Afterwards, list all the citations with their corresponding documents in an ordered list.' %}\n {%- endif %}\n {%- if 'hallucinations' in controls and documents %}\n {%- set system_message = system_message + '\nFinally, after the response is written, include a numbered list of sentences from the response with a corresponding risk value that are hallucinated and not based in the documents.' 
%}\n {%- endif %}\n {%- set loop_messages = messages %}\n {%- endif %}\n {{- '<|start_of_role|>system<|end_of_role|>' + system_message + '<|end_of_text|>\n' }}\n {%- if available_tools %}\n {{- '<|start_of_role|>available_tools<|end_of_role|>' }}\n {{- available_tools | tojson(indent=4) }}\n {{- '<|end_of_text|>\n' }}\n {%- endif %}\n {%- if documents %}\n {%- for document in documents %}\n {{- '<|start_of_role|>document {\"document_id\": \"' + document['doc_id'] | string + '\"}<|end_of_role|>\n' }}\n {{- document['text'] }}\n {{- '<|end_of_text|>\n' }}\n {%- endfor %}\n {%- endif %}\n {%- for message in loop_messages %}\n {{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n {%- if loop.last and add_generation_prompt %}\n {{- '<|start_of_role|>assistant' }}\n {%- if controls %}\n {{- ' ' + controls | tojson()}}\n {%- endif %}\n {{- '<|end_of_role|>' }}\n {%- endif %}\n {%- endfor %}",
225
+ "clean_up_tokenization_spaces": true,
226
+ "eos_token": "<|end_of_text|>",
227
+ "errors": "replace",
228
+ "extra_special_tokens": {},
229
+ "model_max_length": 9223372036854775807,
230
+ "pad_token": "<|end_of_text|>",
231
+ "padding_side": "left",
232
+ "tokenizer_class": "GPT2Tokenizer",
233
+ "unk_token": "<|end_of_text|>",
234
+ "vocab_size": 49152
235
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff