aiqtech rpand002 committed on
Commit
f441fa9
·
verified ·
0 Parent(s):

Duplicate from ibm-granite/granite-4.0-1b

Co-authored-by: Rameswar Panda <[email protected]>

.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
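Every entry in this `.gitattributes` routes matching files through Git LFS (`filter=lfs diff=lfs merge=lfs`) and marks them as non-text (`-text`). As a rough illustration of which filenames such patterns catch, the sketch below approximates a few of the basename-only patterns with Python's `fnmatch`. It is only an approximation under stated assumptions: real gitattributes matching has additional rules, the path-based `saved_model/**/*` entry is deliberately left out, and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the basename-only patterns from the .gitattributes file.
# Path-based patterns such as saved_model/**/* are omitted on purpose,
# because fnmatch does not reproduce git's "**" semantics.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.tar.*", "*.zip", "*tfevents*"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: does the file's basename match any LFS pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


print(is_lfs_tracked("model.safetensors"))           # True
print(is_lfs_tracked("checkpoints/weights.tar.gz"))  # True
print(is_lfs_tracked("README.md"))                   # False
```

In a real checkout, `git check-attr filter -- <path>` gives the authoritative answer for any file.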
README.md ADDED
@@ -0,0 +1,600 @@
+ ---
+ license: apache-2.0
+ library_name: transformers
+ tags:
+ - language
+ - granite-4.0
+ base_model:
+ - ibm-granite/granite-4.0-1b-base
+ ---
+
+ # Granite-4.0-1B
+
+ **Model Summary:**
+ Granite-4.0-1B is a lightweight instruct model finetuned from *Granite-4.0-1B-Base* using a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets. The model was developed using a diverse set of techniques, including supervised finetuning, reinforcement learning, and model merging.
+
+ - **Developers:** Granite Team, IBM
+ - **HF Collection:** [Granite 4.0 Nano Language Models HF Collection](https://huggingface.co/collections/ibm-granite/granite-40-nano-language-models-68e5775c80b60e43b72cfa16)
+ - **GitHub Repository:** [ibm-granite/granite-4.0-nano-language-models](https://github.com/ibm-granite/granite-4.0-nano-language-models)
+ - **Website:** [Granite Docs](https://www.ibm.com/granite/docs/)
+ - **Release Date:** October 28, 2025
+ - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ **Supported Languages:**
+ English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may fine-tune Granite 4.0 Nano models to support languages beyond those included in this list.
+
+ **Intended use:**
+ Granite 4.0 Nano instruct models feature strong instruction-following capabilities, bringing advanced AI within reach for on-device deployments and research use cases. Their compact size also makes them well suited to fine-tuning on specialized domains without requiring massive compute resources.
+
+ *Capabilities*
+ * Summarization
+ * Text classification
+ * Text extraction
+ * Question-answering
+ * Retrieval Augmented Generation (RAG)
+ * Code-related tasks
+ * Function-calling tasks
+ * Multilingual dialog use cases
+ * Fill-In-the-Middle (FIM) code completions
+
+ <!-- <todo>Need to test the examples. (especially the tool calling and RAG ones)</todo>
+ -->
+
+ **Generation:**
+ This is a simple example of how to use the Granite-4.0-1B model.
+
+ Install the following libraries:
+
+ ```shell
+ pip install torch torchvision torchaudio
+ pip install accelerate
+ pip install transformers
+ ```
+ Then, copy the snippet from the section that is relevant for your use case.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ device = "cuda"  # set to "cpu" if no GPU is available
+ model_path = "ibm-granite/granite-4.0-1b"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ # drop device_map if running on CPU
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+ model.eval()
+ # change input text as desired
+ chat = [
+     { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
+ ]
+ # render the conversation into the model's prompt format
+ chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+ # tokenize the text
+ input_tokens = tokenizer(chat, return_tensors="pt").to(device)
+ # generate output tokens
+ output = model.generate(**input_tokens,
+                         max_new_tokens=100)
+ # decode output tokens into text
+ output = tokenizer.batch_decode(output)
+ # print output
+ print(output[0])
+ ```
+
+ Expected output:
+ ```shell
+ <|start_of_role|>user<|end_of_role|>Please list one IBM Research laboratory located in the United States. You should only output its name and location.<|end_of_text|>
+ <|start_of_role|>assistant<|end_of_role|>Almaden Research Center, San Jose, California<|end_of_text|>
+ ```
+
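The decoded string contains the full conversation, special tokens included. If only the assistant's reply is needed, it can be cut out of that string using the role markers visible in the expected output. The following is a minimal sketch that works on the decoded text alone; the helper name `extract_assistant_reply` is hypothetical and not part of `transformers`.

```python
ASSISTANT_HEADER = "<|start_of_role|>assistant<|end_of_role|>"
END_OF_TEXT = "<|end_of_text|>"


def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a decoded transcript."""
    # Everything after the last assistant header belongs to the reply.
    _, found, tail = decoded.rpartition(ASSISTANT_HEADER)
    if not found:
        raise ValueError("no assistant turn found in decoded text")
    # Drop the end-of-text marker (and anything after it), then trim whitespace.
    return tail.split(END_OF_TEXT, 1)[0].strip()


decoded = (
    "<|start_of_role|>user<|end_of_role|>Please list one IBM Research laboratory "
    "located in the United States. You should only output its name and location."
    "<|end_of_text|>\n"
    "<|start_of_role|>assistant<|end_of_role|>Almaden Research Center, "
    "San Jose, California<|end_of_text|>"
)
print(extract_assistant_reply(decoded))  # Almaden Research Center, San Jose, California
```

An alternative that avoids string handling is to slice the generated token ids past the prompt length before decoding and to pass `skip_special_tokens=True` to the tokenizer's decode call.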
+ **Tool-calling:**
+ Granite-4.0-1B comes with enhanced tool-calling capabilities, enabling seamless integration with external functions and APIs. To define a list of tools, follow OpenAI's function [definition schema](https://platform.openai.com/docs/guides/function-calling?api-mode=responses#defining-functions).
+
+ This is an example of how to use the Granite-4.0-1B model's tool-calling ability:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ device = "cuda"  # set to "cpu" if no GPU is available
+ model_path = "ibm-granite/granite-4.0-1b"
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+ # drop device_map if running on CPU
+ model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
+ model.eval()
+
+ # tool definitions in OpenAI's function definition schema
+ tools = [
+     {
+         "type": "function",
+         "function": {
+             "name": "get_current_weather",
+             "description": "Get the current weather for a specified city.",
+             "parameters": {
+                 "type": "object",
+                 "properties": {
+                     "city": {
+                         "type": "string",
+                         "description": "Name of the city"
+                     }
+                 },
+                 "required": ["city"]
+             }
+         }
+     }
+ ]
+
+ # change input text as desired
+ chat = [
+     { "role": "user", "content": "What's the weather like in Boston right now?" },
+ ]
+ # render the conversation and tool list into the model's prompt format
+ chat = tokenizer.apply_chat_template(chat,
+                                      tokenize=False,
+                                      tools=tools,
+                                      add_generation_prompt=True)
+ # tokenize the text
+ input_tokens = tokenizer(chat, return_tensors="pt").to(device)
+ # generate output tokens
+ output = model.generate(**input_tokens,
+                         max_new_tokens=100)
+ # decode output tokens into text
+ output = tokenizer.batch_decode(output)
+ # print output
+ print(output[0])
+ ```
+
+ Expected output:
+ ```shell
+ <|start_of_role|>system<|end_of_role|>You are a helpful assistant with access to the following tools. You may call one or more tools to assist with the user query.
+
+ You are provided with function signatures within <tools></tools> XML tags:
+ <tools>
+ {"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather for a specified city.", "parameters": {"type": "object", "properties": {"city": {"type": "string", "description": "Name of the city"}}, "required": ["city"]}}}
+ </tools>
+
+ For each tool call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
+ <tool_call>
+ {"name": <function-name>, "arguments": <args-json-object>}
+ </tool_call>. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.<|end_of_text|>
+ <|start_of_role|>user<|end_of_role|>What's the weather like in Boston right now?<|end_of_text|>
+ <|start_of_role|>assistant<|end_of_role|><tool_call>
+ {"name": "get_current_weather", "arguments": {"city": "Boston"}}
+ </tool_call><|end_of_text|>
+ ```
+
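The model only emits the tool call; running the tool is left to the caller. The sketch below shows one way to pull the JSON payload out of the `<tool_call>` tags and dispatch it to a local function. It assumes the input is the assistant turn only, because the system prompt contains a non-JSON template inside the same tags. The `get_current_weather` implementation, the `TOOL_REGISTRY` mapping, and the `parse_tool_calls` helper are all hypothetical names introduced for illustration.

```python
import json
import re

# Matches the JSON payload between <tool_call> and </tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def get_current_weather(city: str) -> str:
    """Hypothetical stand-in for a real weather lookup."""
    return f"Weather report for {city} is not implemented in this sketch."


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}


def parse_tool_calls(assistant_text: str) -> list[dict]:
    """Extract every tool call emitted in an assistant turn as a dict."""
    return [json.loads(payload) for payload in TOOL_CALL_RE.findall(assistant_text)]


# Assistant turn copied from the expected output of the tool-calling example.
assistant_text = (
    "<tool_call>\n"
    '{"name": "get_current_weather", "arguments": {"city": "Boston"}}\n'
    "</tool_call><|end_of_text|>"
)

for call in parse_tool_calls(assistant_text):
    tool = TOOL_REGISTRY[call["name"]]
    print(call["name"], call["arguments"], "->", tool(**call["arguments"]))
```

A production version would also need to handle malformed JSON and tool names that are missing from the registry, which this sketch leaves out for brevity.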
+ <!-- **Retrieval Augmented Generation:**
+ *Coming soon* -->
+
+ **Evaluation Results:**
+
+ <table>
+ <thead>
+ <tr>
+ <th style="text-align:left; background-color: #001d6c; color: white;">Benchmarks</th>
+ <th style="text-align:left; background-color: #001d6c; color: white;">Metric</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">350M Dense</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">H 350M Dense</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">1B Dense</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">H 1B Dense</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ General Tasks
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MMLU</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">5-shot</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">35.01</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">36.21</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">59.39</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">59.74</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MMLU-Pro</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">5-shot, CoT</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">12.13</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">14.38</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">34.02</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">32.86</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">BBH</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">3-shot, CoT</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">33.07</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">33.28</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">60.37</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">59.68</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">AGI EVAL</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">0-shot, CoT</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">26.22</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">29.61</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">49.22</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">52.44</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">GPQA</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">0-shot, CoT</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">24.11</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">26.12</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">29.91</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">29.69</td>
+ </tr>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ Alignment Tasks
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">IFEval</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">Instruct, Strict</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">61.63</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">67.63</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">80.82</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">82.37</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">IFEval</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">Prompt, Strict</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">49.17</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">55.64</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">73.94</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">74.68</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">IFEval</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">Average</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">55.4</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">61.63</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">77.38</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">78.53</td>
+ </tr>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ Math Tasks
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">GSM8K</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">8-shot</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">30.71</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">39.27</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">76.35</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">69.83</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">GSM Symbolic</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">8-shot</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">26.76</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">33.7</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">72.3</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">65.72</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">Minerva Math</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">0-shot, CoT</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">13.04</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">5.76</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">45.28</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">49.4</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">DeepMind Math</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">0-shot, CoT</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">8.45</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">6.2</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">34</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">34.98</td>
+ </tr>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ Code Tasks
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">HumanEval</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">39</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">38</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">74</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">73</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">HumanEval+</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">37</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">35</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">69</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">68</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MBPP</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">48</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">49</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">65</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">69</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MBPP+</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">38</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">44</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">57</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">60</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">CRUXEval-O</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">23.75</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">25.5</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">33.13</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">36</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">BigCodeBench</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">11.14</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">11.23</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">30.18</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">29.12</td>
+ </tr>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ Tool Calling Tasks
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">BFCL v3</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;"></td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">39.32</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">43.32</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">54.82</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">50.21</td>
+ </tr>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ Multilingual Tasks
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MULTIPLE</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">pass@1</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">15.99</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">14.31</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">32.24</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">36.11</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MMMLU</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">5-shot</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">28.23</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">27.95</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">45</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">49.43</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">INCLUDE</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">5-shot</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">27.74</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">27.09</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">42.12</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">43.35</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MGSM</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">8-shot</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">14.72</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">16.16</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">37.84</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">27.52</td>
+ </tr>
+ <tr>
+ <td colspan="6" style="text-align:center; background-color: #FFFFFF; color: #2D2D2D; font-style:italic;">
+ Safety
+ </td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">SALAD-Bench</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;"></td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">97.12</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">96.55</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">93.44</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">96.4</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">AttaQ</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;"></td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">82.53</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">81.76</td>
+ <td style="text-align:right; background-color: #DAE8FF; color: #2D2D2D;">85.26</td>
+ <td style="text-align:right; background-color: #FFFFFF; color: #2D2D2D;">82.85</td>
+ </tr>
+ </tbody></table>
+
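The IFEval "Average" row appears to be the arithmetic mean of the two strict-accuracy rows (instruction-level and prompt-level). This is an inference from the reported numbers rather than something the card states. The short check below reproduces the row to within rounding, using the values from the evaluation table.

```python
# Values copied from the IFEval rows of the evaluation table:
# column -> (Instruct Strict, Prompt Strict, reported Average)
ifeval = {
    "350M Dense": (61.63, 49.17, 55.4),
    "H 350M Dense": (67.63, 55.64, 61.63),
    "1B Dense": (80.82, 73.94, 77.38),
    "H 1B Dense": (82.37, 74.68, 78.53),
}

for model, (instruct_strict, prompt_strict, reported) in ifeval.items():
    mean = (instruct_strict + prompt_strict) / 2
    # Allow for rounding of the published figures to two decimals.
    matches = abs(mean - reported) < 0.01
    print(f"{model}: mean={mean:.3f} reported={reported} matches={matches}")
```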
+ <table>
+ <caption><b>Multilingual Benchmarks and their included languages:</b></caption>
+ <thead>
+ <tr>
+ <th style="text-align:left; background-color: #001d6c; color: white;">Benchmarks</th>
+ <th style="text-align:left; background-color: #001d6c; color: white;"># Langs</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">Languages</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MMMLU</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: #2D2D2D;">11</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">ar, de, en, es, fr, ja, ko, pt, zh, bn, hi</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">INCLUDE</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: #2D2D2D;">14</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">hi, bn, ta, te, ar, de, es, fr, it, ja, ko, nl, pt, zh</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">MGSM</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: #2D2D2D;">5</td>
+ <td style="text-align:left; background-color: #FFFFFF; color: #2D2D2D;">en, es, fr, ja, zh</td>
+ </tr>
+ </tbody>
+ </table>
+
+ **Model Architecture:**
+
+ <!-- Granite-4.0-1B baseline is based on a decoder-only dense transformer architecture. Core components of this architecture are: GQA, Mamba2, MLP with SwiGLU, RMSNorm, and shared input/output embeddings. -->
+
+ The Granite-4.0-1B baseline is based on a decoder-only dense transformer architecture. Its core components are grouped-query attention (GQA), MLP with SwiGLU, RMSNorm, and shared input/output embeddings.
+
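To make the effect of GQA concrete, the sketch below works through the attention dimensions for the 1B Dense configuration, using the values listed in the architecture table in this section (embedding size 2048, 40 attention layers, head size 128, 16 attention heads, 4 KV heads). The 2-bytes-per-value figure assumes bf16/fp16 storage of the KV cache, which the card does not specify, so treat the byte count as illustrative.

```python
# 1B Dense values from the architecture table in this section.
embedding_size = 2048
num_layers = 40
head_size = 128
num_attention_heads = 16
num_kv_heads = 4

# Assumption (not stated in the card): the KV cache is stored in a 2-byte dtype.
bytes_per_value = 2

# With GQA, several query heads share one key/value head.
query_heads_per_kv_head = num_attention_heads // num_kv_heads

# Projection widths: queries span all heads, keys/values only the KV heads.
q_width = num_attention_heads * head_size
kv_width = num_kv_heads * head_size

# KV-cache entries per token: keys plus values, for every layer.
kv_values_per_token = 2 * kv_width * num_layers
mha_values_per_token = 2 * q_width * num_layers  # if every head had its own K/V

print("query heads per KV head:", query_heads_per_kv_head)
print("Q width:", q_width, "| K/V width:", kv_width)
print("KV cache per token:", kv_values_per_token * bytes_per_value, "bytes")
print("reduction vs. full multi-head attention:",
      mha_values_per_token // kv_values_per_token, "x")
```

Under these assumptions, sharing 4 KV heads across 16 query heads shrinks the per-token cache by the same factor of four, which matters most for long contexts and on-device deployments.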
+ <table>
+ <thead>
+ <tr>
+ <th style="text-align:left; background-color: #001d6c; color: white;">Model</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">350M Dense</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">H 350M Dense</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">1B Dense</th>
+ <th style="text-align:center; background-color: #001d6c; color: white;">H 1B Dense</th>
+ </tr></thead>
+ <tbody>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Embedding size</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">1024</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">768</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">2048</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">1536</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of layers</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">28 attention</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4 attention / 28 Mamba2</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">40 attention</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4 attention / 36 Mamba2</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Attention head size</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">64</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">64</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">128</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of attention heads</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">16</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">12</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">16</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">12</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of KV heads</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">4</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Mamba2 state size</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">128</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Number of Mamba2 heads</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">48</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">48</td>
+ </tr>
+
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">MLP / Shared expert hidden size</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">2048</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">2048</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">4096</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">4096</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Num. Experts</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Num. active Experts</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Expert hidden size</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">-</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">-</td>
+ </tr>
+ <tr>
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">MLP activation</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
542
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">SwiGLU</td>
543
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">SwiGLU</td>
544
+ </tr>
545
+
546
+ <tr>
547
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Sequence length</td>
548
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">32K</td>
549
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">32K</td>
550
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">128K</td>
551
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">128K</td>
552
+ </tr>
553
+ <tr>
554
+ <td style="text-align:left; background-color: #FFFFFF; color: black;">Position embedding</td>
555
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">RoPE</td>
556
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">NoPE</td>
557
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">RoPE</td>
558
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">NoPE</td>
559
+ </tr>
560
+ <tr>
561
+ <td style="text-align:left; background-color: #FFFFFF; color: black;"># Parameters</td>
562
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">350M</td>
563
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">340M</td>
564
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">1.6B</td>
565
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">1.5B</td>
566
+ </tr>
567
+ <tr>
568
+ <td style="text-align:left; background-color: #FFFFFF; color: black;"># Active parameters</td>
569
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">350M</td>
570
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">340M</td>
571
+ <td style="text-align:center; background-color: #DAE8FF; color: black;">1.6B</td>
572
+ <td style="text-align:center; background-color: #FFFFFF; color: black;">1.5B</td>
573
+ </tr>
574
+ </tbody></table>
575
+
576
+ **Training Data:**
577
+ Overall, our SFT data comes largely from three key sources: (1) publicly available datasets with permissive licenses, (2) internal synthetic data targeting specific capabilities, and (3) a select set of human-curated data.
578
+
579
+ **Infrastructure:**
580
+ We trained the Granite 4.0 Nano Language Models using an NVIDIA GB200 NVL72 cluster hosted at CoreWeave. Intra-rack communication occurs via the 72-GPU NVLink domain, and a non-blocking, full Fat-Tree NDR 400 Gb/s InfiniBand network provides inter-rack communication. This cluster provides a scalable and efficient infrastructure for training our models across thousands of GPUs.
581
+
582
+ **Ethical Considerations and Limitations:**
583
+ Granite 4.0 Nano Instruct Models are finetuned primarily on instruction-response pairs, mostly in English but also covering multiple other languages. Although this model can handle multilingual dialog use cases, its performance on them might not match its performance on English tasks. In such cases, introducing a small number of examples (few-shot) can help the model generate more accurate outputs. While this model has been aligned with safety in mind, it may in some cases produce inaccurate, biased, or unsafe responses to user prompts. We therefore urge the community to use this model with proper safety testing and tuning tailored to their specific tasks.
584
+
585
+ **Resources**
586
+ - ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
587
+ - 📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
588
+ - 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources
589
+
chat_template.jinja ADDED
@@ -0,0 +1,118 @@
1
+ {%- set tools_system_message_prefix = 'You are a helpful assistant with access to the following tools. You may call one or more tools to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>' %}
2
+ {%- set tools_system_message_suffix = '\n</tools>\n\nFor each tool call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.' %}
3
+ {%- set documents_system_message_prefix = 'You are a helpful assistant with access to the following documents. You may use one or more documents to assist with the user query.\n\nYou are given a list of documents within <documents></documents> XML tags:\n<documents>' %}
4
+ {%- set documents_system_message_suffix = '\n</documents>\n\nWrite the response to the user\'s input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.' %}
5
+ {%- set g4_default_system_message = 'You are a helpful assistant. Please ensure responses are professional, accurate, and safe.' %}
6
+ {%- if available_tools is defined and available_tools %}
7
+ {%- set tools = available_tools %}
8
+ {%- endif %}
9
+ {%- set ns = namespace(tools_system_message=tools_system_message_prefix,
10
+ documents_system_message=documents_system_message_prefix,
11
+ default_system_message=g4_default_system_message,
12
+ system_message=''
13
+ ) %}
14
+ {%- if tools %}
15
+ {%- for tool in tools %}
16
+ {%- set ns.tools_system_message = ns.tools_system_message + '\n' + (tool | tojson) %}
17
+ {%- endfor %}
18
+ {%- set ns.tools_system_message = ns.tools_system_message + tools_system_message_suffix %}
19
+ {%- else %}
20
+ {%- set ns.tools_system_message = '' %}
21
+ {%- endif %}
22
+ {%- if documents %}
23
+ {%- for document in documents %}
24
+ {%- set ns.documents_system_message = ns.documents_system_message + '\n' + (document | tojson) %}
25
+ {%- endfor %}
26
+ {%- set ns.documents_system_message = ns.documents_system_message + documents_system_message_suffix %}
27
+ {%- else %}
28
+ {%- set ns.documents_system_message = '' %}
29
+ {%- endif %}
30
+ {%- if messages[0].role == 'system' %}
31
+ {%- if messages[0].content is string %}
32
+ {%- set ns.system_message = messages[0].content %}
33
+ {%- elif messages[0].content is iterable %}
34
+ {%- for entry in messages[0].content %}
35
+ {%- if entry.type== 'text' %}
36
+ {%- if ns.system_message != '' %}
37
+ {%- set ns.system_message = ns.system_message + '\n' %}
38
+ {%- endif %}
39
+ {%- set ns.system_message = ns.system_message + entry.text %}
40
+ {%- endif %}
41
+ {%- endfor %}
42
+ {%- endif %}
43
+ {%- if tools and documents %}
44
+ {%- set ns.system_message = ns.system_message + '\n\n' + ns.tools_system_message + '\n\n' + ns.documents_system_message %}
45
+ {%- elif tools %}
46
+ {%- set ns.system_message = ns.system_message + '\n\n' + ns.tools_system_message %}
47
+ {%- elif documents %}
48
+ {%- set ns.system_message = ns.system_message + '\n\n' + ns.documents_system_message %}
49
+ {%- endif %}
50
+ {%- else %}
51
+ {%- if tools and documents %}
52
+ {%- set ns.system_message = ns.tools_system_message + '\n\n' + ns.documents_system_message %}
53
+ {%- elif tools %}
54
+ {%- set ns.system_message = ns.tools_system_message %}
55
+ {%- elif documents %}
56
+ {%- set ns.system_message = ns.documents_system_message %}
57
+ {%- endif %}
58
+ {%- endif %}
59
+ {%- if ns.system_message %}
60
+ {{- '<|start_of_role|>system<|end_of_role|>' + ns.system_message + '<|end_of_text|>\n' }}
61
+ {%- else %}
62
+ {{- '<|start_of_role|>system<|end_of_role|>' + ns.default_system_message + '<|end_of_text|>\n' }}
63
+ {%- endif %}
64
+ {%- for message in messages %}
65
+ {%- set content = namespace(val='') %}
66
+ {%- if message.content is string %}
67
+ {%- set content.val = message.content %}
68
+ {%- else %}
69
+ {%- if message.content is iterable %}
70
+ {%- for entry in message.content %}
71
+ {%- if entry.type== 'text' %}
72
+ {%- if content.val != '' %}
73
+ {%- set content.val = content.val + '\n' %}
74
+ {%- endif %}
75
+ {%- set content.val = content.val + entry.text %}
76
+ {%- endif %}
77
+ {%- endfor %}
78
+ {%- endif %}
79
+ {%- endif %}
80
+ {%- if (message.role == 'user') or (message.role == 'system' and not loop.first) %}
81
+ {{- '<|start_of_role|>' + message.role + '<|end_of_role|>' + content.val + '<|end_of_text|>\n' }}
82
+ {%- elif message.role == 'assistant' %}
83
+ {{- '<|start_of_role|>' + message.role + '<|end_of_role|>' + content.val }}
84
+ {%- if message.tool_calls %}
85
+ {%- for tool_call in message.tool_calls %}
86
+ {%- if (loop.first and content.val) or (not loop.first) %}
87
+ {{- '\n' }}
88
+ {%- endif %}
89
+ {%- if tool_call.function %}
90
+ {%- set tool_call = tool_call.function %}
91
+ {%- endif %}
92
+ {{- '<tool_call>\n{"name": "' }}
93
+ {{- tool_call.name }}
94
+ {{- '", "arguments": ' }}
95
+ {%- if tool_call.arguments is string %}
96
+ {{- tool_call.arguments }}
97
+ {%- else %}
98
+ {{- tool_call.arguments | tojson }}
99
+ {%- endif %}
100
+ {{- '}\n</tool_call>' }}
101
+ {%- endfor %}
102
+ {%- endif %}
103
+ {{- '<|end_of_text|>\n' }}
104
+ {%- elif message.role == 'tool' %}
105
+ {%- if loop.first or (messages[loop.index0 - 1].role != 'tool') %}
106
+ {{- '<|start_of_role|>user<|end_of_role|>' }}
107
+ {%- endif %}
108
+ {{- '\n<tool_response>\n' }}
109
+ {{- content.val }}
110
+ {{- '\n</tool_response>' }}
111
+ {%- if loop.last or (messages[loop.index0 + 1].role != 'tool') %}
112
+ {{- '<|end_of_text|>\n' }}
113
+ {%- endif %}
114
+ {%- endif %}
115
+ {%- endfor %}
116
+ {%- if add_generation_prompt %}
117
+ {{- '<|start_of_role|>assistant<|end_of_role|>' }}
118
+ {%- endif %}
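For orientation, the simplest path through the template above (plain-string message contents, no tools, no documents, no tool calls) can be sketched in plain Python. This is an illustrative re-implementation under those assumptions, not the code that actually renders prompts; the helper name `render_simple` is hypothetical, and in practice the Jinja template itself should be applied.

```python
# Default system prompt and role markers, copied from the template above.
DEFAULT_SYSTEM = (
    "You are a helpful assistant. "
    "Please ensure responses are professional, accurate, and safe."
)
SOR, EOR, EOT = "<|start_of_role|>", "<|end_of_role|>", "<|end_of_text|>"


def render_simple(messages, add_generation_prompt=True):
    """Hypothetical helper: mirrors the template for string-only messages."""
    # A leading system message replaces the default system prompt.
    system = ""
    if messages and messages[0]["role"] == "system":
        system = messages[0]["content"]
    parts = [f"{SOR}system{EOR}{system or DEFAULT_SYSTEM}{EOT}\n"]

    for index, message in enumerate(messages):
        role = message["role"]
        if role == "system" and index == 0:
            continue  # already emitted as the system turn above
        if role not in ("user", "system", "assistant"):
            # Tool turns and tool calls are outside the scope of this sketch.
            raise ValueError(f"role not covered by this sketch: {role}")
        parts.append(f"{SOR}{role}{EOR}{message['content']}{EOT}\n")

    if add_generation_prompt:
        parts.append(f"{SOR}assistant{EOR}")
    return "".join(parts)


if __name__ == "__main__":
    print(render_simple([{"role": "user", "content": "Hi"}]))
```

Note that the system turn is always emitted, even when the conversation contains no system message, and that every completed turn ends with `<|end_of_text|>` followed by a newline.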
config.json ADDED
@@ -0,0 +1,90 @@
1
+ {
2
+ "architectures": [
3
+ "GraniteMoeHybridForCausalLM"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "attention_multiplier": 0.0078125,
8
+ "bos_token_id": 100257,
9
+ "embedding_multiplier": 12,
10
+ "eos_token_id": 100257,
11
+ "hidden_act": "silu",
12
+ "hidden_size": 2048,
13
+ "init_method": "mup",
14
+ "initializer_range": 0.1,
15
+ "intermediate_size": 4096,
16
+ "layer_types": [
17
+ "attention",
18
+ "attention",
19
+ "attention",
20
+ "attention",
21
+ "attention",
22
+ "attention",
23
+ "attention",
24
+ "attention",
25
+ "attention",
26
+ "attention",
27
+ "attention",
28
+ "attention",
29
+ "attention",
30
+ "attention",
31
+ "attention",
32
+ "attention",
33
+ "attention",
34
+ "attention",
35
+ "attention",
36
+ "attention",
37
+ "attention",
38
+ "attention",
39
+ "attention",
40
+ "attention",
41
+ "attention",
42
+ "attention",
43
+ "attention",
44
+ "attention",
45
+ "attention",
46
+ "attention",
47
+ "attention",
48
+ "attention",
49
+ "attention",
50
+ "attention",
51
+ "attention",
52
+ "attention",
53
+ "attention",
54
+ "attention",
55
+ "attention",
56
+ "attention"
57
+ ],
58
+ "logits_scaling": 8,
59
+ "mamba_chunk_size": 256,
60
+ "mamba_conv_bias": true,
61
+ "mamba_d_conv": 4,
62
+ "mamba_d_head": 32,
63
+ "mamba_d_state": 256,
64
+ "mamba_expand": 2,
65
+ "mamba_n_groups": 1,
66
+ "mamba_n_heads": 128,
67
+ "mamba_proj_bias": false,
68
+ "max_position_embeddings": 131072,
69
+ "model_type": "granitemoehybrid",
70
+ "normalization_function": "rmsnorm",
71
+ "num_attention_heads": 16,
72
+ "num_experts_per_tok": 0,
73
+ "num_hidden_layers": 40,
74
+ "num_key_value_heads": 4,
75
+ "num_local_experts": 0,
76
+ "output_router_logits": false,
77
+ "pad_token_id": 100256,
78
+ "position_embedding_type": "rope",
79
+ "residual_multiplier": 0.22,
80
+ "rms_norm_eps": 1e-05,
81
+ "rope_scaling": null,
82
+ "rope_theta": 10000000,
83
+ "router_aux_loss_coef": 0.01,
84
+ "shared_intermediate_size": 4096,
85
+ "tie_word_embeddings": true,
86
+ "torch_dtype": "bfloat16",
87
+ "transformers_version": "4.56.0",
88
+ "use_cache": true,
89
+ "vocab_size": 100352
90
+ }
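A few quantities follow from the values in the config above by simple arithmetic. The sketch below is a rough back-of-the-envelope calculation, not an official sizing formula. It assumes the per-head dimension is `hidden_size / num_attention_heads`, that all 40 layers are attention layers (as `layer_types` lists), and that the key/value cache is stored in bfloat16 (2 bytes per value); the variable names are illustrative only.

```python
# Values copied from the config.json shown above.
config = {
    "hidden_size": 2048,
    "num_attention_heads": 16,
    "num_key_value_heads": 4,
    "num_hidden_layers": 40,
    "attention_multiplier": 0.0078125,
    "max_position_embeddings": 131072,
}

# Assumption: per-head dimension is hidden_size / num_attention_heads.
head_dim = config["hidden_size"] // config["num_attention_heads"]

# Grouped-query attention: several query heads share each key/value head.
queries_per_kv_head = (
    config["num_attention_heads"] // config["num_key_value_heads"]
)

# Assumption: the KV cache holds bfloat16 values (2 bytes each),
# with one key and one value vector per KV head, per layer, per token.
BYTES_PER_VALUE = 2
kv_bytes_per_token = (
    2  # key + value
    * config["num_hidden_layers"]
    * config["num_key_value_heads"]
    * head_dim
    * BYTES_PER_VALUE
)
kv_gib_at_max_context = (
    kv_bytes_per_token * config["max_position_embeddings"] / 2**30
)

if __name__ == "__main__":
    print("head_dim:", head_dim)                        # 128
    print("queries per KV head:", queries_per_kv_head)  # 4
    print("KV bytes per token:", kv_bytes_per_token)    # 81920
    print("KV cache at max context (GiB):", kv_gib_at_max_context)  # 10.0
```

Under these assumptions the `attention_multiplier` of 0.0078125 equals `1 / head_dim` (1/128), and a single sequence at the full 131072-token context would need about 10 GiB of KV cache on top of the model weights.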
generation_config.json ADDED
@@ -0,0 +1,7 @@
1
+ {
2
+ "_from_model_config": true,
3
+ "bos_token_id": 100257,
4
+ "eos_token_id": 100257,
5
+ "pad_token_id": 100256,
6
+ "transformers_version": "4.56.0"
7
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cc2def9a94efd3a18b21080b053fa64a3f0c329eef357233f0392940179058f4
3
+ size 3263538464
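The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling. The parameter estimate assumes every weight is stored in bfloat16 (2 bytes each) and ignores the safetensors header, so it is only approximate.

```python
# Pointer contents copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:cc2def9a94efd3a18b21080b053fa64a3f0c329eef357233f0392940179058f4
size 3263538464
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split a Git LFS pointer into its fields."""
    # Each non-empty line is "<key> <value>".
    fields = dict(
        line.split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)

# Assumption: all weights are bfloat16 (2 bytes); header overhead is ignored.
approx_params_billions = info["size"] / 2 / 1e9

if __name__ == "__main__":
    print(info["hash_algorithm"], info["size"])
    print(f"approx. parameters: {approx_params_billions:.2f}B")
```

The estimate of roughly 1.63 billion parameters is in line with the 1.6B figure in the README's architecture table.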
model.sig ADDED
@@ -0,0 +1 @@
1
+ {"mediaType":"application/vnd.dev.sigstore.bundle.v0.3+json","verificationMaterial":{"certificate":{"rawBytes":"MIIC5jCCAmugAwIBAgIUaE9dbFHOfVSQw2d2FNupLwr9Dt8wCgYIKoZIzj0EAwMwNzEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MR4wHAYDVQQDExVzaWdzdG9yZS1pbnRlcm1lZGlhdGUwHhcNMjUxMDIzMDgzNTQwWhcNMjUxMDIzMDg0NTQwWjAAMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEPwH/6wSsyGVGiOoDtSZlcjmIuqukkd/d0Gn0DqoW2zwURMuBMeb3ODwv7M5pa0agmcMY20UOYX+86Bc4SchEbKOCAYowggGGMA4GA1UdDwEB/wQEAwIHgDATBgNVHSUEDDAKBggrBgEFBQcDAzAdBgNVHQ4EFgQUAsqCjUHZDHRNipxL3yALfyc1Us0wHwYDVR0jBBgwFoAU39Ppz1YkEZb5qNjpKFWixi4YZD8wJAYDVR0RAQH/BBowGIEWR3Jhbml0ZS12ZXJpZnlAaWJtLmNvbTA0BgorBgEEAYO/MAEBBCZodHRwczovL3NpZ3N0b3JlLnZlcmlmeS5pYm0uY29tL29hdXRoMjA2BgorBgEEAYO/MAEIBCgMJmh0dHBzOi8vc2lnc3RvcmUudmVyaWZ5LmlibS5jb20vb2F1dGgyMIGKBgorBgEEAdZ5AgQCBHwEegB4AHYA3T0wasbHETJjGR4cmWc3AqJKXrjePK3/h4pygC8p7o4AAAGaEDWd2QAABAMARzBFAiEAoGD0mI8PUexCcv9mB9JMC0jzItBjeM+g/6d1uDlKzvMCIF9AhGZTlP8Zu91fQWVqt/UVf5nb+qUsVCeFksA3cejzMAoGCCqGSM49BAMDA2kAMGYCMQDdLLDyCy3QBVDgycp1PmqQbaDF20WTBQGysQOZ/mib6OeSsCEGPicFdJ67+kBMMYQCMQC59dBTK/vAeNLaAhdBS26ti3pgE+itgSImbPCPoDRD7DnP7hHTDnqPy+ogwQ2Q0zU="},"tlogEntries":[{"logIndex":"632554867","logId":{"keyId":"wNI9atQGlz+VWfO6LRygH4QUfY/8W4RFwiT5i5WRgB0="},"kindVersion":{"kind":"dsse","version":"0.0.1"},"integratedTime":"1761208541","inclusionPromise":{"signedEntryTimestamp":"MEYCIQCCREcLClIpNXt/RKiQJkQqf/M3D4Was9CqhL4P9TJx/wIhAL6EP0JT+7orS6cAC/xg7d8hUZb3zioQh6xoA6wWLkpI"},"inclusionProof":{"logIndex":"510650605","rootHash":"JtM5CKsiU56czEOmDJbSg79W1PJlhOBoZ4wih6OR5G4=","treeSize":"510650616","hashes":["zU2pDIQW9/Gxi0zn/9ERnvBTSLvmzle2REqVlWSrRMI=","oP1mUWMurUzKhUj0i8Uha5jdRIonbQtejhO23BfeQqo=","DMmbunYZwR1mdJFGLjkENVBFF5o0nYPDHXMeVGPsHDY=","fAWo5UeUntSXFeKAa88rbFybuuKy5gGWaSRXEShTqfM=","pyZIr0Q0G3CA7EZziPflD7OtauFtMOSua72/uNaC7Ig=","Vlu1uSpLigVxZAihf8wvKNKvlub8zf3zj8n670ACQN4=","vElTcXFdYQz+B/uiBBQoqjjmdtx35dr1NZ249MSMW/o=","G4+JCvKDURrVZVI7jKmcsJ5yhzYUqLaE7ZtpzoT/J8c=","xp4fyXY4mPOTO07Ojr3hUM3+rriPrkyhD5inJHZXZJ0=","b3SRJf
udMpM7N0ZWTGrz8JJRf7VGIO5nrx8Tk3x2Tpc=","od4d4SV8LrOFdkCCDMMDqWNe+WLeyPKGbZjk5ctHMTU=","L7JnNVHMccd20Jd28valUrS+BdGl36tIPWAdM5xWgZc=","AN0KeW4SiQgCqMO00VyTwY/Q5ZXeEmoXiFk78nE2Mlk=","WIkYe49q1liNmi8dK4eDvdF9FeaoePEGwT6ELFCP11Q=","oZ/RwNleZYUr1qtIU7XHivJxFppneqXxNo6t8Hk2ZO8=","Qbl2iYstV2p12QVovHK/LTkkp/BG9jG7BFB4C1wlGvs=","XwGFVrm8LqUnVl0BQMp4ekzv78WkRYHQ4lUBS9A0WeU=","D80xacpqDQIvIRMD3tbcxX9uf1A8e/fviKzJVQvMCEE=","HAOTg+Fg3H1Ej24mw+P9lXfi+4mPL5EKwAd4aNQOV6M=","2Wv4GiithwNukRKV06clevnQQYCzXmSS/+/OJtXgsXQ=","1mfy94KpcItqshH9+gwqV6jccupcaMpVsF28New8zDY=","vS7O4ozHIQZJWBiov+mkpI27GE8zAmVCEkRcP3NDyNE="],"checkpoint":{"envelope":"rekor.sigstore.dev - 1193050959916656506\n510650616\nJtM5CKsiU56czEOmDJbSg79W1PJlhOBoZ4wih6OR5G4=\n\n— rekor.sigstore.dev wNI9ajBFAiAinkPNCh5ueekITiUBoQ+UQ/qf+Lfaut4RuMM7CXuA9gIhAOvtL3jwvjEv2aQ99T+BapPOfwY8wzKYG+gXST0xFSW0\n"}},"canonicalizedBody":"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiZHNzZSIsInNwZWMiOnsiZW52ZWxvcGVIYXNoIjp7ImFsZ29yaXRobSI6InNoYTI1NiIsInZhbHVlIjoiYWNlYjJlNjZhYzBlZmQxYWYyNzllNDQyOTU0YjFmYzM2MTU2MzI0NDEzZjhkN2U4MTVkNDk3YmJjMjE4ODEyYiJ9LCJwYXlsb2FkSGFzaCI6eyJhbGdvcml0aG0iOiJzaGEyNTYiLCJ2YWx1ZSI6ImIyY2IyYjIzYWRjMWM5MWZjODYwZTQ1NzkwNzZmMjQzNmZhOGZiZWI0ZTE5MmQyMzA4NDU1YTcwN2Q4NzdiODgifSwic2lnbmF0dXJlcyI6W3sic2lnbmF0dXJlIjoiTUVZQ0lRRDJaTko4NW1HOG5EOWFFSXc3WkdHdE43QTcvbE5tVjBPcWV0cU81dHdiT0FJaEFNMmVWSHF1ZnBFQmlkdHcrd0VmRUphRkFpVHF3bkhISUFvNjgvaXhMM2JpIiwidmVyaWZpZXIiOiJMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VNMWFrTkRRVzExWjBGM1NVSkJaMGxWWVVVNVpHSkdTRTltVmxOUmR6SmtNa1pPZFhCTWQzSTVSSFE0ZDBObldVbExiMXBKZW1vd1JVRjNUWGNLVG5wRlZrMUNUVWRCTVZWRlEyaE5UV015Ykc1ak0xSjJZMjFWZFZwSFZqSk5ValIzU0VGWlJGWlJVVVJGZUZaNllWZGtlbVJIT1hsYVV6RndZbTVTYkFwamJURnNXa2RzYUdSSFZYZElhR05PVFdwVmVFMUVTWHBOUkdkNlRsUlJkMWRvWTA1TmFsVjRUVVJKZWsxRVp6Qk9WRkYzVjJwQlFVMUdhM2RGZDFsSUNrdHZXa2w2YWpCRFFWRlpTVXR2V2tsNmFqQkVRVkZqUkZGblFVVlFkMGd2Tm5kVGMzbEhWa2RwVDI5RWRGTmFiR05xYlVsMWNYVnJhMlF2WkRCSGJqQUtSSEZ2VnpKNmQxVlNUWFZDVFdWaU0wOUVkM1kzVFRWd1lUQmhaMjFqVFZreU1GVlBX
VmdyT0RaQ1l6UlRZMmhGWWt0UFEwRlpiM2RuWjBkSFRVRTBSd3BCTVZWa1JIZEZRaTkzVVVWQmQwbElaMFJCVkVKblRsWklVMVZGUkVSQlMwSm5aM0pDWjBWR1FsRmpSRUY2UVdSQ1owNVdTRkUwUlVablVWVkJjM0ZEQ21wVlNGcEVTRkpPYVhCNFRETjVRVXhtZVdNeFZYTXdkMGgzV1VSV1VqQnFRa0puZDBadlFWVXpPVkJ3ZWpGWmEwVmFZalZ4VG1wd1MwWlhhWGhwTkZrS1drUTRkMHBCV1VSV1VqQlNRVkZJTDBKQ2IzZEhTVVZYVWpOS2FHSnRiREJhVXpFeVdsaEtjRnB1YkVGaFYwcDBURzFPZG1KVVFUQkNaMjl5UW1kRlJRcEJXVTh2VFVGRlFrSkRXbTlrU0ZKM1kzcHZka3d6VG5CYU0wNHdZak5LYkV4dVdteGpiV3h0WlZNMWNGbHRNSFZaTWpsMFRESTVhR1JZVW05TmFrRXlDa0puYjNKQ1owVkZRVmxQTDAxQlJVbENRMmROU20xb01HUklRbnBQYVRoMll6SnNibU16VW5aamJWVjFaRzFXZVdGWFdqVk1iV3hwWWxNMWFtSXlNSFlLWWpKR01XUkhaM2xOU1VkTFFtZHZja0puUlVWQlpGbzFRV2RSUTBKSWQwVmxaMEkwUVVoWlFUTlVNSGRoYzJKSVJWUktha2RTTkdOdFYyTXpRWEZLU3dwWWNtcGxVRXN6TDJnMGNIbG5Remh3TjI4MFFVRkJSMkZGUkZka01sRkJRVUpCVFVGU2VrSkdRV2xGUVc5SFJEQnRTVGhRVldWNFEyTjJPVzFDT1VwTkNrTXdhbnBKZEVKcVpVMHJaeTgyWkRGMVJHeExlblpOUTBsR09VRm9SMXBVYkZBNFduVTVNV1pSVjFaeGRDOVZWbVkxYm1JcmNWVnpWa05sUm10elFUTUtZMlZxZWsxQmIwZERRM0ZIVTAwME9VSkJUVVJCTW10QlRVZFpRMDFSUkdSTVRFUjVRM2t6VVVKV1JHZDVZM0F4VUcxeFVXSmhSRVl5TUZkVVFsRkhlUXB6VVU5YUwyMXBZalpQWlZOelEwVkhVR2xqUm1SS05qY3JhMEpOVFZsUlEwMVJRelU1WkVKVVN5OTJRV1ZPVEdGQmFHUkNVekkyZEdremNHZEZLMmwwQ21kVFNXMWlVRU5RYjBSU1JEZEVibEEzYUVoVVJHNXhVSGtyYjJkM1VUSlJNSHBWUFFvdExTMHRMVVZPUkNCRFJWSlVTVVpKUTBGVVJTMHRMUzB0Q2c9PSJ9XX19"}],"timestampVerificationData":{"rfc3161Timestamps":[{"signedTimestamp":"MIIE6jADAgEAMIIE4QYJKoZIhvcNAQcCoIIE0jCCBM4CAQMxDTALBglghkgBZQMEAgEwgcIGCyqGSIb3DQEJEAEEoIGyBIGvMIGsAgEBBgkrBgEEAYO/MAIwMTANBglghkgBZQMEAgEFAAQglnQTCUxnMyBcFuvqiRGUmFeZfSJn21tGYrFD0mUMq68CFBALm72DqKbOOr899gyKQckInzVpGA8yMDI1MTAyMzA4MzU0MFowAwIBAQIJAKDuNuLm2pGGoDKkMDAuMRUwEwYDVQQKEwxzaWdzdG9yZS5kZXYxFTATBgNVBAMTDHNpZ3N0b3JlLXRzYaCCAhQwggIQMIIBlqADAgECAhQ6E1QvDJBh7rzBQy/Lio6LKiOLDDAKBggqhkjOPQQDAzA5MRUwEwYDVQQKEwxzaWdzdG9yZS5kZXYxIDAeBgNVBAMTF3NpZ3N0b3JlLXRzYS1zZWxmc2lnbmVkMB4XDTI1MDQwODA2NTk0M1oXDTM1MDQwNjA2NTk0M1owLjEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MRUwEwYDVQQDEwxzaWdzdG9yZS10c2EwdjAQBgcqhkjOPQ
IBBgUrgQQAIgNiAATitrZnyEo2KDZP2QWMIBOgYbfSOTL5ZC/cHMv6Yq+HVIo1H9TC7Cx80KDiyvKhgB3wTqKyi9UDczhqg12b1AOLnRnydMTK+qB8M+1MjBci1+Jb8AV/VXu7CRuQCiPTHFyjajBoMA4GA1UdDwEB/wQEAwIHgDAdBgNVHQ4EFgQUif15Q4fP0GVGwwJGxyxzW3206wMwHwYDVR0jBBgwFoAUmOwB73+7Uf/UlR5vioiYUweJzr8wFgYDVR0lAQH/BAwwCgYIKwYBBQUHAwgwCgYIKoZIzj0EAwMDaAAwZQIwO2mxX/opo7SrIX9QyxfZpJRcpAV2gZOm1AZzR+2rVyy6Uc8Ybp2ybIw13ckH4bcRAjEA5qO8FyOkmYpvg2/7ZNqiPxRzn5vqKHoVcIIqtpKq6l7TvOqzAxxclN7VwTG8e++XMYIB2zCCAdcCAQEwUTA5MRUwEwYDVQQKEwxzaWdzdG9yZS5kZXYxIDAeBgNVBAMTF3NpZ3N0b3JlLXRzYS1zZWxmc2lnbmVkAhQ6E1QvDJBh7rzBQy/Lio6LKiOLDDALBglghkgBZQMEAgGggfwwGgYJKoZIhvcNAQkDMQ0GCyqGSIb3DQEJEAEEMBwGCSqGSIb3DQEJBTEPFw0yNTEwMjMwODM1NDBaMC8GCSqGSIb3DQEJBDEiBCDSBeCnr744vVTPySIdSUTjSaUC6m5tyXnd5Cww3o/ROTCBjgYLKoZIhvcNAQkQAi8xfzB9MHsweQQghfknvAerYsrDtENWwQ78gbLGiD/aernm2HDZ0TrNBbcwVTA9pDswOTEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MSAwHgYDVQQDExdzaWdzdG9yZS10c2Etc2VsZnNpZ25lZAIUOhNULwyQYe68wUMvy4qOiyojiwwwCgYIKoZIzj0EAwIEZzBlAjEAtyugcVClADn2UzghBx+yCZSfDfF63nT3Tiietmepn3hQ78u56oJt+lkGYjQt5+leAjBL+PLdTBKyZyhMIzErkbUdXEbVHSK/d4+6tuCSE/BTkQ2y7KoiZmzYXlNIwFrTfEw="}]}},"dsseEnvelope":{"payload":"ewogICJfdHlwZSI6ICJodHRwczovL2luLXRvdG8uaW8vU3RhdGVtZW50L3YxIiwKICAic3ViamVjdCI6IFsKICAgIHsKICAgICAgIm5hbWUiOiAiZ3Jhbml0ZS00LjAtMWIiLAogICAgICAiZGlnZXN0IjogewogICAgICAgICJzaGEyNTYiOiAiYjBmZWM0Y2ZhM2JjMDE5YzllNDEwZmQzYWJmODUzNzlmZDk3MWQxZWFhNDkzOTViM2VkYjM5OTljYzYyM2MxNiIKICAgICAgfQogICAgfQogIF0sCiAgInByZWRpY2F0ZVR5cGUiOiAiaHR0cHM6Ly9tb2RlbF9zaWduaW5nL3NpZ25hdHVyZS92MS4wIiwKICAicHJlZGljYXRlIjogewogICAgInNlcmlhbGl6YXRpb24iOiB7CiAgICAgICJoYXNoX3R5cGUiOiAic2hhMjU2IiwKICAgICAgImFsbG93X3N5bWxpbmtzIjogZmFsc2UsCiAgICAgICJtZXRob2QiOiAiZmlsZXMiLAogICAgICAiaWdub3JlX3BhdGhzIjogWwogICAgICAgICJtb2RlbC5zaWciLAogICAgICAgICIuZ2l0IiwKICAgICAgICAiLmdpdGF0dHJpYnV0ZXMiLAogICAgICAgICIuZ2l0aWdub3JlIiwKICAgICAgICAiLmdpdGh1YiIKICAgICAgXQogICAgfSwKICAgICJyZXNvdXJjZXMiOiBbCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJSRUFETUUubWQiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IiwKICAgICAgICAiZGlnZXN0IjogIjk3MDI2M
jk3YzMyZjY5ODA4M2RkYzdhZDBmNGNlMmY4Nzc2MWQyOGZjNTEyMmE2NjM1MTVkZmNiOWRkY2IwZWIiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJjaGF0X3RlbXBsYXRlLmppbmphIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIsCiAgICAgICAgImRpZ2VzdCI6ICI5NTI0ZGY2N2I3N2E3YjI1YTJkZmVlODk4Zjc1YjMxNmExNTdlYjlkODU1YjUxZTMyYWVhYzc5ZDdjOGE4M2NlIgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAiY29uZmlnLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IiwKICAgICAgICAiZGlnZXN0IjogIjUzMTM4NzJmODExMDJlNjI2MTdlNjM4ZjhiZTU0MWVlNjhmMTRkZWQxYWU0ODRhODkyNjQ1YjVjNWExMGFlYTkiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJnZW5lcmF0aW9uX2NvbmZpZy5qc29uIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIsCiAgICAgICAgImRpZ2VzdCI6ICI3YzA0Y2I5ZDJiYTc3MWY3NTI4ZmJhNWE3MTA0OTk5Y2RhZjc1NjZkMDJiNWZiZDU4NDcyODI5ZjYyNzE2MTc3IgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAibWVyZ2VzLnR4dCIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiLAogICAgICAgICJkaWdlc3QiOiAiYjZmZTQyNGUzMzQ5MDNmN2ZiODRkM2ExMDZkOTczMDQ1NWY0NzQ0YjlmZTNjMjFlZTEzNmQ5N2EwMGU3MjUwMiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogIm1vZGVsLnNhZmV0ZW5zb3JzIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIsCiAgICAgICAgImRpZ2VzdCI6ICJjYzJkZWY5YTk0ZWZkM2ExOGIyMTA4MGIwNTNmYTY0YTNmMGMzMjllZWYzNTcyMzNmMDM5Mjk0MDE3OTA1OGY0IgogICAgICB9LAogICAgICB7CiAgICAgICAgIm5hbWUiOiAic3BlY2lhbF90b2tlbnNfbWFwLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IiwKICAgICAgICAiZGlnZXN0IjogImMwODY3NmM0OWZkNzk2OWEzMTMwZjcyYmU2ZDRiZjM0ZGE2NmFhNDg0YTZlMjFkZmZlMzU5ODkzYTFiZDVmMmUiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJ0b2tlbml6ZXIuanNvbiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiLAogICAgICAgICJkaWdlc3QiOiAiZTJiYWQ2NjQzOTUzOGNiNGQ1YTc1ODA2ODA5MzI0MzJlZDllY2U5ZDNiODU3N2U2NzU1MTJiZGYxMTU5OTI1MyIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogInRva2VuaXplcl9jb25maWcuanNvbiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiLAogICAgICAgICJkaWdlc3QiOiAiYTVlYzVkYWFiMTJiYTA5MGE5MGYzZGQxNjljOGY5YzI3NTU1NzAxM2E4N2I5YzEyNThkYzdjYjQ5N2EzNWM4NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogInZvY2FiLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IiwKI
CAgICAgICAiZGlnZXN0IjogIjhhZjcxMDc2ZGU4YjBiNjI2ZWVkMGY0Yzk4NGZhZjBhN2MwNjI0NzkxNjRiMmEzMTMwOGE5NDg1MjRkNGY2OWMiCiAgICAgIH0KICAgIF0KICB9Cn0=","payloadType":"application/vnd.in-toto+json","signatures":[{"sig":"MEYCIQD2ZNJ85mG8nD9aEIw7ZGGtN7A7/lNmV0OqetqO5twbOAIhAM2eVHqufpEBidtw+wEfEJaFAiTqwnHHIAo68/ixL3bi"}]}}
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
1
+ {
2
+ "bos_token": {
3
+ "content": "<|end_of_text|>",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "eos_token": {
10
+ "content": "<|end_of_text|>",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "<|pad|>",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "unk_token": {
24
+ "content": "<|unk|>",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ }
30
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,783 @@
1
+ {
2
+ "add_bos_token": false,
3
+ "add_prefix_space": false,
4
+ "added_tokens_decoder": {
5
+ "100256": {
6
+ "content": "<|pad|>",
7
+ "lstrip": false,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false,
11
+ "special": true
12
+ },
13
+ "100257": {
14
+ "content": "<|end_of_text|>",
15
+ "lstrip": false,
16
+ "normalized": false,
17
+ "rstrip": false,
18
+ "single_word": false,
19
+ "special": true
20
+ },
21
+ "100258": {
22
+ "content": "<|fim_prefix|>",
23
+ "lstrip": false,
24
+ "normalized": false,
25
+ "rstrip": false,
26
+ "single_word": false,
27
+ "special": false
28
+ },
29
+ "100259": {
30
+ "content": "<|fim_middle|>",
31
+ "lstrip": false,
32
+ "normalized": false,
33
+ "rstrip": false,
34
+ "single_word": false,
35
+ "special": false
36
+ },
37
+ "100260": {
38
+ "content": "<|fim_suffix|>",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false,
43
+ "special": false
44
+ },
45
+ "100261": {
46
+ "content": "<|fim_pad|>",
47
+ "lstrip": false,
48
+ "normalized": false,
49
+ "rstrip": false,
50
+ "single_word": false,
51
+ "special": false
52
+ },
53
+ "100262": {
54
+ "content": "<|filename|>",
55
+ "lstrip": false,
56
+ "normalized": false,
57
+ "rstrip": false,
58
+ "single_word": false,
59
+ "special": false
60
+ },
61
+ "100263": {
62
+ "content": "<|reponame|>",
63
+ "lstrip": false,
64
+ "normalized": false,
65
+ "rstrip": false,
66
+ "single_word": false,
67
+ "special": false
68
+ },
69
+ "100264": {
70
+ "content": "<|start_of_role|>",
71
+ "lstrip": false,
72
+ "normalized": false,
73
+ "rstrip": false,
74
+ "single_word": false,
75
+ "special": true
76
+ },
77
+ "100265": {
78
+ "content": "<|end_of_role|>",
79
+ "lstrip": false,
80
+ "normalized": false,
81
+ "rstrip": false,
82
+ "single_word": false,
83
+ "special": true
84
+ },
85
+ "100266": {
86
+ "content": "<|unused_1|>",
87
+ "lstrip": false,
88
+ "normalized": false,
89
+ "rstrip": false,
90
+ "single_word": false,
91
+ "special": true
92
+ },
93
+ "100267": {
94
+ "content": "<|start_of_plugin|>",
95
+ "lstrip": false,
96
+ "normalized": false,
97
+ "rstrip": false,
98
+ "single_word": false,
99
+ "special": true
100
+ },
101
+ "100268": {
102
+ "content": "<|end_of_plugin|>",
103
+ "lstrip": false,
104
+ "normalized": false,
105
+ "rstrip": false,
106
+ "single_word": false,
107
+ "special": true
108
+ },
109
+ "100269": {
110
+ "content": "<|unk|>",
111
+ "lstrip": false,
112
+ "normalized": false,
113
+ "rstrip": false,
114
+ "single_word": false,
115
+ "special": true
116
+ },
117
+ "100270": {
118
+ "content": "<tool_call>",
119
+ "lstrip": false,
120
+ "normalized": false,
121
+ "rstrip": false,
122
+ "single_word": false,
123
+ "special": false
124
+ },
125
+ "100271": {
126
+ "content": "</tool_call>",
127
+ "lstrip": false,
128
+ "normalized": false,
129
+ "rstrip": false,
130
+ "single_word": false,
131
+ "special": false
132
+ },
133
+ "100272": {
134
+ "content": "<tool_response>",
135
+ "lstrip": false,
136
+ "normalized": false,
137
+ "rstrip": false,
138
+ "single_word": false,
139
+ "special": false
140
+ },
141
+ "100273": {
142
+ "content": "</tool_response>",
143
+ "lstrip": false,
144
+ "normalized": false,
145
+ "rstrip": false,
146
+ "single_word": false,
147
+ "special": false
148
+ },
149
+ "100274": {
150
+ "content": "<think>",
151
+ "lstrip": false,
152
+ "normalized": false,
153
+ "rstrip": false,
154
+ "single_word": false,
155
+ "special": false
156
+ },
157
+ "100275": {
158
+ "content": "</think>",
159
+ "lstrip": false,
160
+ "normalized": false,
161
+ "rstrip": false,
162
+ "single_word": false,
163
+ "special": false
164
+ },
165
+ "100276": {
166
+ "content": "<think_on>",
167
+ "lstrip": false,
168
+ "normalized": false,
169
+ "rstrip": false,
170
+ "single_word": false,
171
+ "special": true
172
+ },
173
+ "100277": {
174
+ "content": "<think_off>",
175
+ "lstrip": false,
176
+ "normalized": false,
177
+ "rstrip": false,
178
+ "single_word": false,
179
+ "special": true
180
+ },
181
+ "100278": {
182
+ "content": "<schema>",
183
+ "lstrip": false,
184
+ "normalized": false,
185
+ "rstrip": false,
186
+ "single_word": false,
187
+ "special": true
188
+ },
189
+ "100279": {
190
+ "content": "</schema>",
191
+ "lstrip": false,
192
+ "normalized": false,
193
+ "rstrip": false,
194
+ "single_word": false,
195
+ "special": true
196
+ },
197
+ "100280": {
198
+ "content": "<tools>",
199
+ "lstrip": false,
200
+ "normalized": false,
201
+ "rstrip": false,
202
+ "single_word": false,
203
+ "special": true
204
+ },
205
+ "100281": {
206
+ "content": "</tools>",
207
+ "lstrip": false,
208
+ "normalized": false,
209
+ "rstrip": false,
210
+ "single_word": false,
211
+ "special": true
212
+ },
213
+ "100282": {
214
+ "content": "<documents>",
215
+ "lstrip": false,
216
+ "normalized": false,
217
+ "rstrip": false,
218
+ "single_word": false,
219
+ "special": true
220
+ },
221
+ "100283": {
222
+ "content": "</documents>",
223
+ "lstrip": false,
224
+ "normalized": false,
225
+ "rstrip": false,
226
+ "single_word": false,
227
+ "special": true
228
+ },
229
+ "100284": {
230
+ "content": "<|unused_15|>",
231
+ "lstrip": false,
232
+ "normalized": false,
233
+ "rstrip": false,
234
+ "single_word": false,
235
+ "special": true
236
+ },
237
+ "100285": {
238
+ "content": "<|unused_16|>",
239
+ "lstrip": false,
240
+ "normalized": false,
241
+ "rstrip": false,
242
+ "single_word": false,
243
+ "special": true
244
+ },
245
+ "100286": {
246
+ "content": "<|unused_17|>",
247
+ "lstrip": false,
248
+ "normalized": false,
249
+ "rstrip": false,
250
+ "single_word": false,
251
+ "special": true
252
+ },
253
+ "100287": {
254
+ "content": "<|unused_18|>",
255
+ "lstrip": false,
256
+ "normalized": false,
257
+ "rstrip": false,
258
+ "single_word": false,
259
+ "special": true
260
+ },
261
+ "100288": {
262
+ "content": "<|unused_19|>",
263
+ "lstrip": false,
264
+ "normalized": false,
265
+ "rstrip": false,
266
+ "single_word": false,
267
+ "special": true
268
+ },
269
+ "100289": {
270
+ "content": "<|unused_20|>",
271
+ "lstrip": false,
272
+ "normalized": false,
273
+ "rstrip": false,
274
+ "single_word": false,
275
+ "special": true
276
+ },
277
+ "100290": {
278
+ "content": "<|unused_21|>",
279
+ "lstrip": false,
280
+ "normalized": false,
281
+ "rstrip": false,
282
+ "single_word": false,
283
+ "special": true
284
+ },
285
+ "100291": {
286
+ "content": "<|unused_22|>",
287
+ "lstrip": false,
288
+ "normalized": false,
289
+ "rstrip": false,
290
+ "single_word": false,
291
+ "special": true
292
+ },
293
+ "100292": {
294
+ "content": "<|unused_23|>",
295
+ "lstrip": false,
296
+ "normalized": false,
297
+ "rstrip": false,
298
+ "single_word": false,
299
+ "special": true
300
+ },
301
+ "100293": {
302
+ "content": "<|unused_24|>",
303
+ "lstrip": false,
304
+ "normalized": false,
305
+ "rstrip": false,
306
+ "single_word": false,
307
+ "special": true
308
+ },
309
+ "100294": {
310
+ "content": "<|unused_25|>",
311
+ "lstrip": false,
312
+ "normalized": false,
313
+ "rstrip": false,
314
+ "single_word": false,
315
+ "special": true
316
+ },
317
+ "100295": {
318
+ "content": "<|unused_26|>",
319
+ "lstrip": false,
320
+ "normalized": false,
321
+ "rstrip": false,
322
+ "single_word": false,
323
+ "special": true
324
+ },
325
+ "100296": {
326
+ "content": "<|unused_27|>",
327
+ "lstrip": false,
328
+ "normalized": false,
329
+ "rstrip": false,
330
+ "single_word": false,
331
+ "special": true
332
+ },
333
+ "100297": {
334
+ "content": "<|unused_28|>",
335
+ "lstrip": false,
336
+ "normalized": false,
337
+ "rstrip": false,
338
+ "single_word": false,
339
+ "special": true
340
+ },
341
+ "100298": {
342
+ "content": "<|unused_29|>",
343
+ "lstrip": false,
344
+ "normalized": false,
345
+ "rstrip": false,
346
+ "single_word": false,
347
+ "special": true
348
+ },
349
+ "100299": {
350
+ "content": "<|unused_30|>",
351
+ "lstrip": false,
352
+ "normalized": false,
353
+ "rstrip": false,
354
+ "single_word": false,
355
+ "special": true
356
+ },
357
+ "100300": {
358
+ "content": "<|unused_31|>",
359
+ "lstrip": false,
360
+ "normalized": false,
361
+ "rstrip": false,
362
+ "single_word": false,
363
+ "special": true
364
+ },
365
+ "100301": {
366
+ "content": "<|unused_32|>",
367
+ "lstrip": false,
368
+ "normalized": false,
369
+ "rstrip": false,
370
+ "single_word": false,
371
+ "special": true
372
+ },
373
+ "100302": {
374
+ "content": "<|unused_33|>",
375
+ "lstrip": false,
376
+ "normalized": false,
377
+ "rstrip": false,
378
+ "single_word": false,
379
+ "special": true
380
+ },
381
+ "100303": {
382
+ "content": "<|unused_34|>",
383
+ "lstrip": false,
384
+ "normalized": false,
385
+ "rstrip": false,
386
+ "single_word": false,
387
+ "special": true
388
+ },
389
+ "100304": {
390
+ "content": "<|unused_35|>",
391
+ "lstrip": false,
392
+ "normalized": false,
393
+ "rstrip": false,
394
+ "single_word": false,
395
+ "special": true
396
+ },
397
+ "100305": {
398
+ "content": "<|unused_36|>",
399
+ "lstrip": false,
400
+ "normalized": false,
401
+ "rstrip": false,
402
+ "single_word": false,
403
+ "special": true
404
+ },
405
+ "100306": {
406
+ "content": "<|unused_37|>",
407
+ "lstrip": false,
408
+ "normalized": false,
409
+ "rstrip": false,
410
+ "single_word": false,
411
+ "special": true
412
+ },
413
+ "100307": {
414
+ "content": "<|unused_38|>",
415
+ "lstrip": false,
416
+ "normalized": false,
417
+ "rstrip": false,
418
+ "single_word": false,
419
+ "special": true
420
+ },
421
+ "100308": {
422
+ "content": "<|unused_39|>",
423
+ "lstrip": false,
424
+ "normalized": false,
425
+ "rstrip": false,
426
+ "single_word": false,
427
+ "special": true
428
+ },
429
+ "100309": {
430
+ "content": "<|unused_40|>",
431
+ "lstrip": false,
432
+ "normalized": false,
433
+ "rstrip": false,
434
+ "single_word": false,
435
+ "special": true
436
+ },
437
+ "100310": {
438
+ "content": "<|unused_41|>",
439
+ "lstrip": false,
440
+ "normalized": false,
441
+ "rstrip": false,
442
+ "single_word": false,
443
+ "special": true
444
+ },
445
+ "100311": {
446
+ "content": "<|unused_42|>",
447
+ "lstrip": false,
448
+ "normalized": false,
449
+ "rstrip": false,
450
+ "single_word": false,
451
+ "special": true
452
+ },
453
+ "100312": {
454
+ "content": "<|unused_43|>",
455
+ "lstrip": false,
456
+ "normalized": false,
457
+ "rstrip": false,
458
+ "single_word": false,
459
+ "special": true
460
+ },
461
+ "100313": {
462
+ "content": "<|unused_44|>",
463
+ "lstrip": false,
464
+ "normalized": false,
465
+ "rstrip": false,
466
+ "single_word": false,
467
+ "special": true
468
+ },
469
+ "100314": {
470
+ "content": "<|unused_45|>",
471
+ "lstrip": false,
472
+ "normalized": false,
473
+ "rstrip": false,
474
+ "single_word": false,
475
+ "special": true
476
+ },
477
+ "100315": {
478
+ "content": "<|unused_46|>",
479
+ "lstrip": false,
480
+ "normalized": false,
481
+ "rstrip": false,
482
+ "single_word": false,
483
+ "special": true
484
+ },
485
+ "100316": {
486
+ "content": "<|unused_47|>",
487
+ "lstrip": false,
488
+ "normalized": false,
489
+ "rstrip": false,
490
+ "single_word": false,
491
+ "special": true
492
+ },
493
+ "100317": {
494
+ "content": "<|unused_48|>",
495
+ "lstrip": false,
496
+ "normalized": false,
497
+ "rstrip": false,
498
+ "single_word": false,
499
+ "special": true
500
+ },
501
+ "100318": {
502
+ "content": "<|unused_49|>",
503
+ "lstrip": false,
504
+ "normalized": false,
505
+ "rstrip": false,
506
+ "single_word": false,
507
+ "special": true
508
+ },
509
+ "100319": {
510
+ "content": "<|unused_50|>",
511
+ "lstrip": false,
512
+ "normalized": false,
513
+ "rstrip": false,
514
+ "single_word": false,
515
+ "special": true
516
+ },
517
+ "100320": {
518
+ "content": "<|unused_51|>",
519
+ "lstrip": false,
520
+ "normalized": false,
521
+ "rstrip": false,
522
+ "single_word": false,
523
+ "special": true
524
+ },
525
+ "100321": {
526
+ "content": "<|unused_52|>",
527
+ "lstrip": false,
528
+ "normalized": false,
529
+ "rstrip": false,
530
+ "single_word": false,
531
+ "special": true
532
+ },
533
+ "100322": {
534
+ "content": "<|unused_53|>",
535
+ "lstrip": false,
536
+ "normalized": false,
537
+ "rstrip": false,
538
+ "single_word": false,
539
+ "special": true
540
+ },
541
+ "100323": {
542
+ "content": "<|unused_54|>",
543
+ "lstrip": false,
544
+ "normalized": false,
545
+ "rstrip": false,
546
+ "single_word": false,
547
+ "special": true
548
+ },
549
+ "100324": {
550
+ "content": "<|unused_55|>",
551
+ "lstrip": false,
552
+ "normalized": false,
553
+ "rstrip": false,
554
+ "single_word": false,
555
+ "special": true
556
+ },
557
+ "100325": {
558
+ "content": "<|unused_56|>",
559
+ "lstrip": false,
560
+ "normalized": false,
561
+ "rstrip": false,
562
+ "single_word": false,
563
+ "special": true
564
+ },
565
+ "100326": {
566
+ "content": "<|unused_57|>",
567
+ "lstrip": false,
568
+ "normalized": false,
569
+ "rstrip": false,
570
+ "single_word": false,
571
+ "special": true
572
+ },
573
+ "100327": {
574
+ "content": "<|unused_58|>",
575
+ "lstrip": false,
576
+ "normalized": false,
577
+ "rstrip": false,
578
+ "single_word": false,
579
+ "special": true
580
+ },
581
+ "100328": {
582
+ "content": "<|unused_59|>",
583
+ "lstrip": false,
584
+ "normalized": false,
585
+ "rstrip": false,
586
+ "single_word": false,
587
+ "special": true
588
+ },
589
+ "100329": {
590
+ "content": "<|unused_60|>",
591
+ "lstrip": false,
592
+ "normalized": false,
593
+ "rstrip": false,
594
+ "single_word": false,
595
+ "special": true
596
+ },
597
+ "100330": {
598
+ "content": "<|unused_61|>",
599
+ "lstrip": false,
600
+ "normalized": false,
601
+ "rstrip": false,
602
+ "single_word": false,
603
+ "special": true
604
+ },
605
+ "100331": {
606
+ "content": "<|unused_62|>",
607
+ "lstrip": false,
608
+ "normalized": false,
609
+ "rstrip": false,
610
+ "single_word": false,
611
+ "special": true
612
+ },
613
+ "100332": {
614
+ "content": "<|unused_63|>",
615
+ "lstrip": false,
616
+ "normalized": false,
617
+ "rstrip": false,
618
+ "single_word": false,
619
+ "special": true
620
+ },
621
+ "100333": {
622
+ "content": "<|unused_64|>",
623
+ "lstrip": false,
624
+ "normalized": false,
625
+ "rstrip": false,
626
+ "single_word": false,
627
+ "special": true
628
+ },
629
+ "100334": {
630
+ "content": "<|unused_65|>",
631
+ "lstrip": false,
632
+ "normalized": false,
633
+ "rstrip": false,
634
+ "single_word": false,
635
+ "special": true
636
+ },
637
+ "100335": {
638
+ "content": "<|unused_66|>",
639
+ "lstrip": false,
640
+ "normalized": false,
641
+ "rstrip": false,
642
+ "single_word": false,
643
+ "special": true
644
+ },
645
+ "100336": {
646
+ "content": "<|unused_67|>",
647
+ "lstrip": false,
648
+ "normalized": false,
649
+ "rstrip": false,
650
+ "single_word": false,
651
+ "special": true
652
+ },
653
+ "100337": {
654
+ "content": "<|unused_68|>",
655
+ "lstrip": false,
656
+ "normalized": false,
657
+ "rstrip": false,
658
+ "single_word": false,
659
+ "special": true
660
+ },
661
+ "100338": {
662
+ "content": "<|unused_69|>",
663
+ "lstrip": false,
664
+ "normalized": false,
665
+ "rstrip": false,
666
+ "single_word": false,
667
+ "special": true
668
+ },
669
+ "100339": {
670
+ "content": "<|unused_70|>",
671
+ "lstrip": false,
672
+ "normalized": false,
673
+ "rstrip": false,
674
+ "single_word": false,
675
+ "special": true
676
+ },
677
+ "100340": {
678
+ "content": "<|unused_71|>",
679
+ "lstrip": false,
680
+ "normalized": false,
681
+ "rstrip": false,
682
+ "single_word": false,
683
+ "special": true
684
+ },
685
+ "100341": {
686
+ "content": "<|unused_72|>",
687
+ "lstrip": false,
688
+ "normalized": false,
689
+ "rstrip": false,
690
+ "single_word": false,
691
+ "special": true
692
+ },
693
+ "100342": {
694
+ "content": "<|unused_73|>",
695
+ "lstrip": false,
696
+ "normalized": false,
697
+ "rstrip": false,
698
+ "single_word": false,
699
+ "special": true
700
+ },
701
+ "100343": {
702
+ "content": "<|unused_74|>",
703
+ "lstrip": false,
704
+ "normalized": false,
705
+ "rstrip": false,
706
+ "single_word": false,
707
+ "special": true
708
+ },
709
+ "100344": {
710
+ "content": "<|unused_75|>",
711
+ "lstrip": false,
712
+ "normalized": false,
713
+ "rstrip": false,
714
+ "single_word": false,
715
+ "special": true
716
+ },
717
+ "100345": {
718
+ "content": "<|unused_76|>",
719
+ "lstrip": false,
720
+ "normalized": false,
721
+ "rstrip": false,
722
+ "single_word": false,
723
+ "special": true
724
+ },
725
+ "100346": {
726
+ "content": "<|unused_77|>",
727
+ "lstrip": false,
728
+ "normalized": false,
729
+ "rstrip": false,
730
+ "single_word": false,
731
+ "special": true
732
+ },
733
+ "100347": {
734
+ "content": "<|unused_78|>",
735
+ "lstrip": false,
736
+ "normalized": false,
737
+ "rstrip": false,
738
+ "single_word": false,
739
+ "special": true
740
+ },
741
+ "100348": {
742
+ "content": "<|unused_79|>",
743
+ "lstrip": false,
744
+ "normalized": false,
745
+ "rstrip": false,
746
+ "single_word": false,
747
+ "special": true
748
+ },
749
+ "100349": {
750
+ "content": "<|unused_80|>",
751
+ "lstrip": false,
752
+ "normalized": false,
753
+ "rstrip": false,
754
+ "single_word": false,
755
+ "special": true
756
+ },
757
+ "100350": {
758
+ "content": "<|unused_81|>",
759
+ "lstrip": false,
760
+ "normalized": false,
761
+ "rstrip": false,
762
+ "single_word": false,
763
+ "special": true
764
+ },
765
+ "100351": {
766
+ "content": "<|unused_82|>",
767
+ "lstrip": false,
768
+ "normalized": false,
769
+ "rstrip": false,
770
+ "single_word": false,
771
+ "special": true
772
+ }
773
+ },
774
+ "bos_token": "<|end_of_text|>",
775
+ "clean_up_tokenization_spaces": false,
776
+ "eos_token": "<|end_of_text|>",
777
+ "extra_special_tokens": {},
778
+ "model_max_length": 1000000000000000019884624838656,
779
+ "pad_token": "<|pad|>",
780
+ "padding_side": "left",
781
+ "tokenizer_class": "GPT2Tokenizer",
782
+ "unk_token": "<|unk|>"
783
+ }
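A minimal sketch of how the `added_tokens_decoder` entries in this config could be inspected, assuming the file is read as plain JSON. The excerpt embedded below copies a few entries from the diff and keeps only the `content` and `special` fields. The helper name `split_added_tokens` is hypothetical and not part of any library. The idea that downstream tooling consults the `special` flag when deciding which tokens to strip from decoded text is an assumption; the diff itself does not show it.

```python
import json

# Excerpt of the config above, trimmed to the fields used here.
# (Assumption: the real file parses as ordinary JSON with this shape.)
CONFIG_EXCERPT = """
{
  "added_tokens_decoder": {
    "100269": {"content": "<|unk|>", "special": true},
    "100270": {"content": "<tool_call>", "special": false},
    "100274": {"content": "<think>", "special": false},
    "100276": {"content": "<think_on>", "special": true}
  },
  "eos_token": "<|end_of_text|>",
  "pad_token": "<|pad|>",
  "padding_side": "left"
}
"""


def split_added_tokens(config):
    """Hypothetical helper: return (special, regular) dicts of id -> content."""
    special, regular = {}, {}
    for token_id, entry in config["added_tokens_decoder"].items():
        # JSON object keys are strings, so convert the id to an int.
        target = special if entry["special"] else regular
        target[int(token_id)] = entry["content"]
    return special, regular


config = json.loads(CONFIG_EXCERPT)
special_tokens, regular_tokens = split_added_tokens(config)
print("special:", special_tokens)
print("regular:", regular_tokens)
```

In this excerpt the tool-call and `<think>` markers are flagged `"special": false`, while `<think_on>` and `<|unk|>` are flagged `"special": true`, which matches the entries shown in the diff.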
vocab.json ADDED
The diff for this file is too large to render.