Upload 11 files
Browse files
- .gitattributes +2 -0
- Benchmark_PRM_perf.png +3 -0
- PRM_BoN_rows.png +3 -0
- README.md +209 -3
- adapter_config.json +39 -0
- adapter_model.safetensors +3 -0
- added_tokens.json +9 -0
- merges.txt +0 -0
- special_tokens_map.json +39 -0
- tokenizer.json +0 -0
- tokenizer_config.json +235 -0
- vocab.json +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Benchmark_PRM_perf.png filter=lfs diff=lfs merge=lfs -text
+PRM_BoN_rows.png filter=lfs diff=lfs merge=lfs -text
Benchmark_PRM_perf.png
ADDED
Git LFS Details
PRM_BoN_rows.png
ADDED
Git LFS Details
README.md
CHANGED
@@ -1,3 +1,209 @@
- ---
- license: apache-2.0
- ---
---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- reward model
base_model:
- ibm-granite/granite-3.3-8b-instruct
---

# Granite-3.3-8B-LoRA-Math-PRM

**Model Summary**

Granite 3.3 8B LoRA Math PRM is a LoRA adapter for the 8-billion-parameter language model [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct), built for use as a generative process reward model (PRM) for process supervision in mathematical reasoning. Crucially, this model has been trained only on curated data from sources with permissive licenses, and we release it under an Apache 2.0 license.

This model can be used to assess the correctness of each step of a mathematical reasoning process. It shows strong performance on Best-of-N evaluations for a variety of generators on MATH-500, as well as strong error-identification performance on both [ProcessBench](https://arxiv.org/abs/2412.06559) and [PRMBench](https://arxiv.org/abs/2501.03124).

- Developers: Granite Alignment Team, IBM Research
- Release Date: June 24th, 2025
- License: Apache 2.0

**Supported Languages**

This adapter has been finetuned specifically for English; however, the base model supports English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.

**Intended Use**

Granite 3.3 8B LoRA Math PRM is a LoRA adapter for [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) that equips the language model for process supervision of mathematical reasoning by assessing the correctness of each step of a reasoning chain. At inference, the model takes a question and a response broken down into generated steps, and for each step it determines whether the reasoning chain so far is correct (indicated by generating a single token, `Y`) or incorrect (indicated by generating `N`). The probability of generating `Y` can be treated as a numeric reward score in applications such as Best-of-N evaluation.

Before scoring, the model expects the user-provided prompt `"Is this response correct so far (Y/N)?"`, which should be appended at the end of every step of the reasoning chain.

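
As a minimal illustration of this layout (the question and steps below are toy placeholders, not from this card; the full scoring pipeline is shown in the Usage section), each step becomes a user turn ending with the generation prompt, followed by an assistant turn holding the `Y` token whose probability is read out as the step reward:

```python
# Minimal sketch of the per-step chat layout the adapter expects.
# The question and steps here are toy placeholders, not from the model card.
generation_prompt = "Is this response correct so far (Y/N)?"
question = "What is 2 + 3 * 4?"
steps = ["First, compute 3 * 4 = 12.", "Then add 2 to get 14."]

messages = []
for i, step in enumerate(steps):
    prefix = question + " " if i == 0 else ""
    messages.append({"role": "user", "content": prefix + step + " " + generation_prompt})
    messages.append({"role": "assistant", "content": "Y"})
```
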
**Evaluation Results**

**a. Best-of-N Evaluation on MATH-500**

We evaluate MATH-500 performance under inference-time scaling with a variety of LLM generators, including [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct), [Phi-4](https://huggingface.co/microsoft/phi-4), and [Qwen-2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct), and observe strong gains over Majority Voting with both Best-of-N and Weighted Majority Voting using Granite-3.3-8B-LoRA-Math-PRM.

<img src="PRM_BoN_rows.png" alt="PRM Performance on MATH-500" width="5000"/>

We also compare Best-of-N performance on MATH-500 against other available PRMs on [Qwen-2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct) generations, and show the strong performance of Granite-3.3-8B-LoRA-Math-PRM over majority voting:

| Method / N | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 |
|---|---|---|---|---|---|---|---|---|
| Majority Voting | 75.8 | 81.6 | 84.6 | 85.2 | 85.4 | 85.6 | 86.0 | 85.6 |
| **Granite-3.3-8B-LoRA-Math-PRM** | 81.6 | 84.2 | 84.8 | 86.2 | 86.8 | 87.2 | 88.0 | 87.2 |
| [Qwen2.5-Math-PRM-7B](https://huggingface.co/Qwen/Qwen2.5-Math-PRM-7B) | 82.0 | 84.8 | 86.6 | 87.0 | 88.2 | 89.0 | 88.8 | 89.0 |
| [MathShepherd-Mistral-7B-PRM](https://huggingface.co/peiyi9979/math-shepherd-mistral-7b-prm) | 80.8 | 83.0 | 83.8 | 84.8 | 86.2 | 85.2 | 86.0 | 85.2 |
| [RLHFlow Llama3.1-8B-PRM-Deepseek-Data](https://huggingface.co/RLHFlow/Llama3.1-8B-PRM-Deepseek-Data) | 80.6 | 82.4 | 83.6 | 85.2 | 85.8 | 85.8 | 85.0 | 84.6 |

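
For reference, the sketch below shows how response-level PRM scores can drive Best-of-N and Weighted Majority Voting selection. Aggregating step rewards by their minimum is one common convention, but the exact aggregation behind the numbers above is not specified here, so treat this as an illustrative sketch rather than the evaluation code:

```python
from collections import defaultdict
from typing import Dict, List

def response_score(step_rewards: List[float]) -> float:
    # Assumption: score a chain by its weakest step (min-aggregation).
    return min(step_rewards)

def best_of_n(candidates: List[Dict]) -> str:
    # candidates: [{"answer": str, "step_rewards": [float, ...]}, ...]
    return max(candidates, key=lambda c: response_score(c["step_rewards"]))["answer"]

def weighted_majority_vote(candidates: List[Dict]) -> str:
    # Sum PRM scores per distinct final answer and pick the highest-weighted answer.
    weights = defaultdict(float)
    for c in candidates:
        weights[c["answer"]] += response_score(c["step_rewards"])
    return max(weights, key=weights.get)
```
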

**b. ProcessBench**

<img src="Benchmark_PRM_perf.png" alt="PRM Performance on ProcessBench and PRMBench" height="400"/>

As shown above, Granite-3.3-8B-LoRA-Math-PRM performs strongly on both ProcessBench (top 3) and PRMBench (top 2) compared to other models of the same parameter class, indicating strong error-detection ability for reasoning tasks.

**c. PRMBench: Detailed Results**

| Model | Overall | NR. | NCL. | Avg (simplicity) | ES. | SC. | DC. | CI. | Avg (soundness) | PS. | DR. | MS. | Avg (sensitivity) |
|-------|---------|-----|------|------------------|-----|-----|-----|-----|-----------------|-----|-----|-----|-------------------|
| **Granite-3.3-8B-LoRA-Math-PRM** | 64.5 | 50.9 | 61.5 | 56.2 | 69.1 | 66.7 | 64.7 | 70.5 | 67.8 | 59.9 | 65.9 | 98.1 | 74.7 |
| [Qwen2.5-Math-PRM-7B](https://huggingface.co/Qwen/Qwen2.5-Math-PRM-7B) | 65.5 | 49.0 | 55.1 | 52.1 | 71.8 | 67.3 | 66.3 | 78.5 | 71.0 | 57.6 | 69.1 | 99.7 | 75.5 |
| [Skywork-PRM-7B](https://huggingface.co/Skywork/Skywork-o1-Open-PRM-Qwen-2.5-7B) | 65.1 | 56.4 | 62.8 | 59.6 | 69.4 | 67.1 | 67.7 | 69.9 | 68.5 | 60.9 | 65.8 | 93.2 | 73.7 |
| [Skywork-PRM-1.5B](https://huggingface.co/Skywork/Skywork-o1-Open-PRM-Qwen-2.5-1.5B) | 61.1 | 52.0 | 56.4 | 54.2 | 64.8 | 64.9 | 63.3 | 66.5 | 64.9 | 57.5 | 63.3 | 91.1 | 70.7 |
| [ReasonEval-34B](https://huggingface.co/GAIR/ReasonEval-34B) | 60.5 | 54.8 | 48.1 | 51.5 | 66.4 | 60.3 | 57.8 | 67.5 | 63.0 | 57.7 | 64.3 | 97.2 | 73.1 |
| [ReasonEval-7B](https://huggingface.co/GAIR/ReasonEval-7B) | 60.1 | 61.0 | 50.1 | 55.6 | 62.1 | 65.9 | 61.5 | 66.0 | 63.9 | 55.7 | 58.0 | 99.5 | 71.1 |
| [RLHFlow-PRM-Mistral-8B](https://huggingface.co/RLHFlow/Llama3.1-8B-PRM-Mistral-Data) | 54.4 | 46.1 | 47.3 | 46.7 | 56.6 | 55.1 | 54.4 | 63.8 | 57.5 | 51.5 | 56.2 | 97.9 | 68.5 |
| [RLHFlow-PRM-Deepseek-8B](https://huggingface.co/RLHFlow/Llama3.1-8B-PRM-Deepseek-Data) | 54.2 | 46.4 | 48.9 | 47.6 | 55.7 | 55.0 | 53.2 | 66.2 | 57.5 | 49.0 | 55.4 | 99.8 | 68.1 |
| [MathShepherd-Mistral-7B](https://huggingface.co/peiyi9979/math-shepherd-mistral-7b-prm) | 47.0 | 44.0 | 50.3 | 47.1 | 49.4 | 44.5 | 41.3 | 47.7 | 45.7 | 47.2 | 48.6 | 86.1 | 60.7 |

**Training Data**

To train the Math PRM adapter, we curate training data from a diverse set of model responses to prompts from math-specific datasets, namely [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), [MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), and [NuminaMath](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT). We leverage a diverse set of LLMs from the Granite language model family, Phi-4, and Mixtral 8x22B to generate outputs, and use the automatic process supervision method described in [Luo et al., 2024](https://arxiv.org/abs/2406.06592) to detect steps containing errors.

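
As a rough illustration of that labeling idea (a simplified sketch only; the exact procedure and hyperparameters follow the cited paper, and `sample_completions` and `is_correct` are hypothetical helpers, not part of this repository), a step can be scored by how often rollouts from its prefix still reach the reference answer:

```python
from typing import Callable, List

def label_steps(question: str,
                steps: List[str],
                reference_answer: str,
                sample_completions: Callable[[str, int], List[str]],
                is_correct: Callable[[str, str], bool],
                n_rollouts: int = 8) -> List[float]:
    """For each step prefix, estimate correctness as the fraction of sampled
    completions from that prefix that still reach the reference answer."""
    labels = []
    prefix = question
    for step in steps:
        prefix = prefix + "\n" + step
        completions = sample_completions(prefix, n_rollouts)
        labels.append(sum(is_correct(c, reference_answer) for c in completions) / n_rollouts)
    return labels
```
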
**Usage**

Sample use for obtaining PRM scores for a given response using Hugging Face Transformers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from typing import List

def prepare_input(query: str, steps: List[str], tokenizer: AutoTokenizer, correct_token: str, generation_prompt: str):
    messages = []

    for s_idx, step in enumerate(steps):
        if s_idx == 0:
            # append query and first step
            message = {'role': 'user', 'content': query + " " + step + " " + generation_prompt}
        else:
            message = {'role': 'user', 'content': step + " " + generation_prompt}

        messages.append(message)
        messages.append({'role': 'assistant', 'content': correct_token})

    input_message = tokenizer.apply_chat_template(messages, add_generation_prompt=False, tokenize=False)

    return input_message

def get_step_ids(input_ids, tokenizer, correct_token, correct_token_id):
    # get assistant turn indices
    asst_text = "<|start_of_role|>assistant<|end_of_role|>" + correct_token + "<|end_of_text|>"
    asst_toks = tokenizer(asst_text, add_special_tokens=False, return_tensors="pt")['input_ids'][0]
    asst_toks_before_correct_token = asst_toks[:torch.where(asst_toks == correct_token_id)[0].item()].tolist()

    input_ids = input_ids[0]
    # find the indices of the assistant-turn "Y" tokens, not just any occurrence of correct_token_id
    correct_token_indices = torch.where(input_ids == correct_token_id)[0].tolist()
    prm_indices = []
    for t_idx in correct_token_indices:
        if input_ids[t_idx - len(asst_toks_before_correct_token):t_idx].tolist() == asst_toks_before_correct_token:
            prm_indices.append(t_idx - 1)  # the logits at position i predict token i+1, so we read the PREVIOUS position

    assert len(prm_indices) > 0
    return prm_indices

model_name_or_path = "ibm-granite/granite-3.3-8b-lora-math-prm"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

correct_token = "Y"
correct_token_id = tokenizer.encode(correct_token, add_special_tokens=False)[0]
generation_prompt = "Is this response correct so far (Y/N)?"

data = {
    "query": "For breakfast, Anna bought a bagel for $x and a glass of orange juice for $0.85. At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. How much more money did Anna spend on lunch than on breakfast? If we know the answer to the above question is 4, what is the value of unknown variable x?",
    "response": [
        "At breakfast, Anna spent x dollars on a bagel and $0.85 on a glass of orange juice. The total cost of breakfast is x + $0.85.",
        "At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. The total cost of lunch is $4.65 + $1.15 = $5.80.",
        "To find out how much more money Anna spent on lunch than on breakfast, we subtract the cost of breakfast from the cost of lunch: $5.80 - (x + $0.85).",
        "We are given that the difference is $4, so we can write: $5.80 - (x + $0.85) = $4.",
        "Simplifying the left side, we get: $5.80 - x - $0.85 = $4.",
        "Adding -$0.85 to both sides, we get: $5.80 -x = $3.15.",
        "Subtracting $5.80 from both sides, we get: -x = -$2.65.",
        "Dividing both sides by -1, we get: x = $2.65."
    ]
}

formatted_data = prepare_input(query=data['query'], steps=data['response'], tokenizer=tokenizer, correct_token=correct_token, generation_prompt=generation_prompt)
input_ids = tokenizer.encode(formatted_data, return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(input_ids=input_ids).logits

# get step positions
prm_indices = get_step_ids(input_ids, tokenizer, correct_token, correct_token_id)

# get corresponding rewards: convert logits to probabilities and take the probability of the correct token id as the reward
softmax = torch.nn.Softmax(dim=-1)
step_rewards = []
for prm_idx in prm_indices:
    step_rewards.append(softmax(logits[0, prm_idx, :])[correct_token_id].item())

print(step_rewards)
# [0.9998785257339478, 0.9996663331985474, 0.9991942048072815, 0.9993413090705872, 0.9996351003646851, 0.519490122795105, 0.9416136145591736, 0.9942548871040344]
```

To use the PRM as a verbalizer of correctness for a specific step:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "ibm-granite/granite-3.3-8b-lora-math-prm"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
generation_prompt = "Is this response correct so far (Y/N)?"

data = {
    "query": "For breakfast, Anna bought a bagel for $x and a glass of orange juice for $0.85. At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. How much more money did Anna spend on lunch than on breakfast? If we know the answer to the above question is 4, what is the value of unknown variable x?",
    "partial_response":
        "At breakfast, Anna spent x dollars on a bagel and $0.85 on a glass of orange juice. The total cost of breakfast is x + $0.85. At lunch, Anna spent $4.65 on a sandwich and $1.15 on a carton of milk. The total cost of lunch is $4.65 + $1.15 = $5.80. To find out how much more money Anna spent on lunch than on breakfast, we subtract the cost of breakfast from the cost of lunch: $5.80 - (x + $0.85).",
}

# format the prompt
formatted_prompt = tokenizer.apply_chat_template([{'role': 'user', 'content': data['query'] + " " + data['partial_response'] + " " + generation_prompt}], add_generation_prompt=True, tokenize=False)
inputs = tokenizer(formatted_prompt, return_tensors="pt")

# generate output
with torch.no_grad():
    response = model.generate(inputs["input_ids"].to(model.device), attention_mask=inputs["attention_mask"].to(model.device), max_new_tokens=2)

output_text = tokenizer.decode(response[0])
print(output_text)
# <|start_of_role|>assistant<|end_of_role|>Y<|end_of_text|>
```

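
If only the verdict token is needed, the prompt tokens can be sliced off before decoding (a small follow-up sketch, not from the original card, continuing the variables above):

```python
# Decode only the newly generated tokens (the Y/N verdict), dropping the prompt.
new_tokens = response[0, inputs["input_ids"].shape[1]:]
verdict = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
print(verdict)  # expected: "Y" for the partial response above
```
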
**Infrastructure**

We train Granite-3.3-8B-LoRA-Math-PRM using IBM's supercomputing cluster, Blue Vela, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable and efficient infrastructure for training our models over multiple GPUs.

**Ethical Considerations and Limitations**

Granite-3.3-8B-LoRA-Math-PRM is an adapter for Granite-3.3-8B-Instruct. Since it inherits its foundation from the instruct model, all ethical considerations and limitations applicable to [Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) remain relevant.
adapter_config.json
ADDED
@@ -0,0 +1,39 @@
{
  "alpha_pattern": {},
  "auto_mapping": null,
  "base_model_name_or_path": "ibm-granite/granite-3.3-8b-instruct",
  "bias": "none",
  "corda_config": null,
  "eva_config": null,
  "exclude_modules": null,
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 8,
  "lora_bias": false,
  "lora_dropout": 0.1,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "r": 8,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "up_proj",
    "down_proj",
    "gate_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
  "use_dora": false,
  "use_rslora": false
}
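
For reference, this adapter can also be loaded explicitly with PEFT on top of the base model (a minimal sketch, not from the original card; it assumes `peft` is installed and is equivalent in effect to the direct loading shown in the README's Usage section):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter described by this config.
base = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-3.3-8b-instruct", device_map="auto")
model = PeftModel.from_pretrained(base, "ibm-granite/granite-3.3-8b-lora-math-prm")
tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-3.3-8b-lora-math-prm")
```
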
adapter_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:01f931c4a8bc84c2a04ab3a0463337147f74086a2c71cd479914b4627d1e8800
size 49554176
added_tokens.json
ADDED
@@ -0,0 +1,9 @@
{
  "<|end_of_cite|>": 49156,
  "<|end_of_plugin|>": 49158,
  "<|end_of_role|>": 49153,
  "<|start_of_cite|>": 49155,
  "<|start_of_plugin|>": 49157,
  "<|start_of_role|>": 49152,
  "<|tool_call|>": 49154
}
merges.txt
ADDED
The diff for this file is too large to render.
See raw diff
special_tokens_map.json
ADDED
@@ -0,0 +1,39 @@
{
  "additional_special_tokens": [
    "<|start_of_role|>",
    "<|end_of_role|>",
    "<|tool_call|>",
    "<|start_of_cite|>",
    "<|end_of_cite|>",
    "<|start_of_plugin|>",
    "<|end_of_plugin|>"
  ],
  "bos_token": {
    "content": "<|end_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|end_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|end_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<|end_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
ADDED
The diff for this file is too large to render.
See raw diff
tokenizer_config.json
ADDED
@@ -0,0 +1,235 @@
{
  "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<|end_of_text|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<fim_prefix>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "<fim_middle>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<fim_suffix>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "<fim_pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "5": {
      "content": "<filename>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "6": {
      "content": "<gh_stars>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "7": {
      "content": "<issue_start>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "8": {
      "content": "<issue_comment>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "9": {
      "content": "<issue_closed>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "10": {
      "content": "<jupyter_start>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "11": {
      "content": "<jupyter_text>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "12": {
      "content": "<jupyter_code>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "13": {
      "content": "<jupyter_output>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "14": {
      "content": "<empty_output>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "15": {
      "content": "<commit_before>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "16": {
      "content": "<commit_msg>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "17": {
      "content": "<commit_after>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "18": {
      "content": "<reponame>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49152": {
      "content": "<|start_of_role|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49153": {
      "content": "<|end_of_role|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49154": {
      "content": "<|tool_call|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49155": {
      "content": "<|start_of_cite|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49156": {
      "content": "<|end_of_cite|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49157": {
      "content": "<|start_of_plugin|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "49158": {
      "content": "<|end_of_plugin|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [
    "<|start_of_role|>",
    "<|end_of_role|>",
    "<|tool_call|>",
    "<|start_of_cite|>",
    "<|end_of_cite|>",
    "<|start_of_plugin|>",
    "<|end_of_plugin|>"
  ],
  "bos_token": "<|end_of_text|>",
"chat_template": "{# Alias tools -> available_tools #}\n{%- if tools and not available_tools -%}\n {%- set available_tools = tools -%}\n{%- endif -%}\n{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] %}\n {%- set loop_messages = messages[1:] %}\n {%- else %}\n {%- set system_message = \"Knowledge Cutoff Date: April 2024.\nToday's Date: \" + strftime_now('%B %d, %Y') + \".\nYou are Granite, developed by IBM.\" %}\n {%- if available_tools and documents %}\n {%- set system_message = system_message + \" You are a helpful assistant with access to the following tools. When a tool is required to answer the user's query, respond only with <|tool_call|> followed by a JSON list of tools used. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.\nWrite the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.\" %}\n {%- elif available_tools %}\n {%- set system_message = system_message + \" You are a helpful assistant with access to the following tools. When a tool is required to answer the user's query, respond only with <|tool_call|> followed by a JSON list of tools used. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.\" %}\n {%- elif documents %}\n {%- set system_message = system_message + \" Write the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.\" %}\n {%- elif thinking %}\n {%- set system_message = system_message + \" You are a helpful AI assistant.\nRespond to every user query in a comprehensive and detailed way. You can write down your thoughts and reasoning process before responding. In the thought process, engage in a comprehensive cycle of analysis, summarization, exploration, reassessment, reflection, backtracing, and iteration to develop well-considered thinking process. In the response section, based on various attempts, explorations, and reflections from the thoughts section, systematically present the final solution that you deem correct. The response should summarize the thought process. Write your thoughts between <think></think> and write your response between <response></response> for each user query.\" %}\n {%- else %}\n {%- set system_message = system_message + \" You are a helpful AI assistant.\" %}\n {%- endif %}\n {%- if 'citations' in controls and documents %}\n {%- set system_message = system_message + '\nUse the symbols <|start_of_cite|> and <|end_of_cite|> to indicate when a fact comes from a document in the search result, e.g <|start_of_cite|> {document_id: 1}my fact <|end_of_cite|> for a fact from document 1. Afterwards, list all the citations with their corresponding documents in an ordered list.' %}\n {%- endif %}\n {%- if 'hallucinations' in controls and documents %}\n {%- set system_message = system_message + '\nFinally, after the response is written, include a numbered list of sentences from the response with a corresponding risk value that are hallucinated and not based in the documents.' 
%}\n {%- endif %}\n {%- set loop_messages = messages %}\n {%- endif %}\n {{- '<|start_of_role|>system<|end_of_role|>' + system_message + '<|end_of_text|>\n' }}\n {%- if available_tools %}\n {{- '<|start_of_role|>available_tools<|end_of_role|>' }}\n {{- available_tools | tojson(indent=4) }}\n {{- '<|end_of_text|>\n' }}\n {%- endif %}\n {%- if documents %}\n {%- for document in documents %}\n {{- '<|start_of_role|>document {\"document_id\": \"' + document['doc_id'] | string + '\"}<|end_of_role|>\n' }}\n {{- document['text'] }}\n {{- '<|end_of_text|>\n' }}\n {%- endfor %}\n {%- endif %}\n {%- for message in loop_messages %}\n {{- '<|start_of_role|>' + message['role'] + '<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n {%- if loop.last and add_generation_prompt %}\n {{- '<|start_of_role|>assistant' }}\n {%- if controls %}\n {{- ' ' + controls | tojson()}}\n {%- endif %}\n {{- '<|end_of_role|>' }}\n {%- endif %}\n {%- endfor %}",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|end_of_text|>",
  "errors": "replace",
  "extra_special_tokens": {},
  "model_max_length": 9223372036854775807,
  "pad_token": "<|end_of_text|>",
  "padding_side": "left",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|end_of_text|>",
  "vocab_size": 49152
}
vocab.json
ADDED
The diff for this file is too large to render.
See raw diff