prince-canuma, patrickvonplaten committed
Commit 25787e1 · verified · 0 Parent(s)

Duplicate from mistralai/Mistral-Small-3.1-24B-Instruct-2503

Co-authored-by: Patrick von Platen <[email protected]>

.gitattributes ADDED
@@ -0,0 +1,37 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tekken.json filter=lfs diff=lfs merge=lfs -text
37
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
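The patterns above decide which files Git stores through LFS. As a rough illustration, the sketch below checks bare file names against a handful of those patterns with Python's `fnmatch`. This only approximates gitattributes matching (directory patterns such as `saved_model/**/*` follow Git's own rules and are left out), and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase

# A subset of the patterns listed in the .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.safetensors", "*.tar.*", "*tfevents*", "tekken.json", "tokenizer.json"]


def is_lfs_tracked(filename: str) -> bool:
    """Return True if the bare file name matches any of the listed patterns."""
    return any(fnmatchcase(filename, pattern) for pattern in LFS_PATTERNS)


for name in ["consolidated.safetensors", "tokenizer.json", "README.md"]:
    print(name, is_lfs_tracked(name))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports what Git itself resolves.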
README.md ADDED
@@ -0,0 +1,431 @@
1
+ ---
2
+ language:
3
+ - en
4
+ - fr
5
+ - de
6
+ - es
7
+ - pt
8
+ - it
9
+ - ja
10
+ - ko
11
+ - ru
12
+ - zh
13
+ - ar
14
+ - fa
15
+ - id
16
+ - ms
17
+ - ne
18
+ - pl
19
+ - ro
20
+ - sr
21
+ - sv
22
+ - tr
23
+ - uk
24
+ - vi
25
+ - hi
26
+ - bn
27
+ license: apache-2.0
28
+ library_name: vllm
29
+ inference: false
30
+ base_model:
31
+ - mistralai/Mistral-Small-3.1-24B-Base-2503
32
+ extra_gated_description: If you want to learn more about how we process your personal
33
+ data, please read our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
34
+ ---
35
+
36
+ # Model Card for Mistral-Small-3.1-24B-Instruct-2503
37
+
38
+ Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) **adds state-of-the-art vision understanding** and enhances **long context capabilities up to 128k tokens** without compromising text performance.
39
+ With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.
40
+ This model is an instruction-finetuned version of: [Mistral-Small-3.1-24B-Base-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503).
41
+
42
+ Mistral Small 3.1 can be deployed locally and is exceptionally "knowledge-dense," fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized.
43
+
44
+ It is ideal for:
45
+ - Fast-response conversational agents.
46
+ - Low-latency function calling.
47
+ - Subject matter experts via fine-tuning.
48
+ - Local inference for hobbyists and organizations handling sensitive data.
49
+ - Programming and math reasoning.
50
+ - Long document understanding.
51
+ - Visual understanding.
52
+
53
+ For enterprises requiring specialized capabilities (increased context, specific modalities, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
54
+
55
+ Learn more about Mistral Small 3.1 in our [blog post](https://mistral.ai/news/mistral-small-3-1/).
56
+
57
+ ## Key Features
58
+ - **Vision:** Vision capabilities enable the model to analyze images and provide insights based on visual content in addition to text.
59
+ - **Multilingual:** Supports dozens of languages, including English, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi.
60
+ - **Agent-Centric:** Offers best-in-class agentic capabilities with native function calling and JSON outputting.
61
+ - **Advanced Reasoning:** State-of-the-art conversational and reasoning capabilities.
62
+ - **Apache 2.0 License:** Open license allowing usage and modification for both commercial and non-commercial purposes.
63
+ - **Context Window:** A 128k context window.
64
+ - **System Prompt:** Maintains strong adherence and support for system prompts.
65
+ - **Tokenizer:** Utilizes a Tekken tokenizer with a 131k vocabulary size.
66
+
67
+ ## Benchmark Results
68
+
69
+ When available, we report numbers previously published by other model providers; otherwise, we re-evaluate them using our own evaluation harness.
70
+
71
+ ### Pretrain Evals
72
+
73
+ | Model | MMLU (5-shot) | MMLU Pro (5-shot CoT) | TriviaQA | GPQA Main (5-shot CoT)| MMMU |
74
+ |--------------------------------|---------------|-----------------------|------------|-----------------------|-----------|
75
+ | **Small 3.1 24B Base** | **81.01%** | **56.03%** | 80.50% | **37.50%** | **59.27%**|
76
+ | Gemma 3 27B PT | 78.60% | 52.20% | **81.30%** | 24.30% | 56.10% |
77
+
78
+ ### Instruction Evals
79
+
80
+ #### Text
81
+
82
+ | Model | MMLU | MMLU Pro (5-shot CoT) | MATH | GPQA Main (5-shot CoT) | GPQA Diamond (5-shot CoT) | MBPP | HumanEval | SimpleQA (TotalAcc)|
83
+ |--------------------------------|-----------|-----------------------|------------------------|------------------------|---------------------------|-----------|-----------|--------------------|
84
+ | **Small 3.1 24B Instruct** | 80.62% | 66.76% | 69.30% | **44.42%** | **45.96%** | 74.71% | **88.41%**| **10.43%** |
85
+ | Gemma 3 27B IT | 76.90% | **67.50%** | **89.00%** | 36.83% | 42.40% | 74.40% | 87.80% | 10.00% |
86
+ | GPT4o Mini | **82.00%**| 61.70% | 70.20% | 40.20% | 39.39% | 84.82% | 87.20% | 9.50% |
87
+ | Claude 3.5 Haiku | 77.60% | 65.00% | 69.20% | 37.05% | 41.60% | **85.60%**| 88.10% | 8.02% |
88
+ | Cohere Aya-Vision 32B | 72.14% | 47.16% | 41.98% | 34.38% | 33.84% | 70.43% | 62.20% | 7.65% |
89
+
90
+ #### Vision
91
+
92
+ | Model | MMMU | MMMU PRO | Mathvista | ChartQA | DocVQA | AI2D | MM MT Bench |
93
+ |--------------------------------|------------|-----------|-----------|-----------|-----------|-------------|-------------|
94
+ | **Small 3.1 24B Instruct** | 64.00% | **49.25%**| **68.91%**| 86.24% | **94.08%**| **93.72%** | **7.3** |
95
+ | Gemma 3 27B IT | **64.90%** | 48.38% | 67.60% | 76.00% | 86.60% | 84.50% | 7 |
96
+ | GPT4o Mini | 59.40% | 37.60% | 56.70% | 76.80% | 86.70% | 88.10% | 6.6 |
97
+ | Claude 3.5 Haiku | 60.50% | 45.03% | 61.60% | **87.20%**| 90.00% | 92.10% | 6.5 |
98
+ | Cohere Aya-Vision 32B | 48.20% | 31.50% | 50.10% | 63.04% | 72.40% | 82.57% | 4.1 |
99
+
100
+ ### Multilingual Evals
101
+
102
+ | Model | Average | European | East Asian | Middle Eastern |
103
+ |--------------------------------|------------|------------|------------|----------------|
104
+ | **Small 3.1 24B Instruct** | **71.18%** | **75.30%** | **69.17%** | 69.08% |
105
+ | Gemma 3 27B IT | 70.19% | 74.14% | 65.65% | 70.76% |
106
+ | GPT4o Mini | 70.36% | 74.21% | 65.96% | **70.90%** |
107
+ | Claude 3.5 Haiku | 70.16% | 73.45% | 67.05% | 70.00% |
108
+ | Cohere Aya-Vision 32B | 62.15% | 64.70% | 57.61% | 64.12% |
109
+
110
+ ### Long Context Evals
111
+
112
+ | Model | LongBench v2 | RULER 32K | RULER 128K |
113
+ |--------------------------------|-----------------|-------------|------------|
114
+ | **Small 3.1 24B Instruct** | **37.18%** | **93.96%** | 81.20% |
115
+ | Gemma 3 27B IT | 34.59% | 91.10% | 66.00% |
116
+ | GPT4o Mini | 29.30% | 90.20% | 65.8% |
117
+ | Claude 3.5 Haiku | 35.19% | 92.60% | **91.90%** |
118
+
119
+ ## Basic Instruct Template (V7-Tekken)
120
+
121
+ ```
122
+ <s>[SYSTEM_PROMPT]<system prompt>[/SYSTEM_PROMPT][INST]<user message>[/INST]<assistant response></s>[INST]<user message>[/INST]
123
+ ```
124
+ *`<system prompt>`, `<user message>` and `<assistant response>` are placeholders.*
125
+
126
+ ***Please make sure to use [mistral-common](https://github.com/mistralai/mistral-common) as the source of truth***
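As an illustration of the layout only, the following sketch concatenates text-only turns into the string form shown in the template. It is not a substitute for `mistral-common`: real tokenization maps the control markers to special tokens rather than plain text, and the function name `format_v7_tekken` is hypothetical.

```python
def format_v7_tekken(system_prompt: str, turns: list[dict]) -> str:
    """Lay out text-only chat turns following the template string shown above."""
    parts = ["<s>", f"[SYSTEM_PROMPT]{system_prompt}[/SYSTEM_PROMPT]"]
    for turn in turns:
        if turn["role"] == "user":
            parts.append(f"[INST]{turn['content']}[/INST]")
        elif turn["role"] == "assistant":
            # Each assistant response is closed with the end-of-sequence marker.
            parts.append(f"{turn['content']}</s>")
        else:
            raise ValueError(f"unsupported role: {turn['role']}")
    return "".join(parts)


prompt = format_v7_tekken(
    "Be brief.",
    [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "How are you?"},
    ],
)
print(prompt)
```

The string ends on an open `[/INST]`, which is where the model is expected to continue with its next response.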
127
+
128
+ ## Usage
129
+
130
+ The model can be used with the following frameworks:
131
+ - [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm)
132
+
133
+ **Note 1**: We recommend using a relatively low temperature, such as `temperature=0.15`.
134
+
135
+ **Note 2**: Make sure to add a system prompt to the model to best tailor it to your needs. If you want to use the model as a general assistant, we recommend the following
136
+ system prompt:
137
+
138
+ ```
139
+ system_prompt = """You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.
140
+ You power an AI assistant called Le Chat.
141
+ Your knowledge base was last updated on 2023-10-01.
142
+ The current date is {today}.
143
+
144
+ When you're not sure about some information, you say that you don't have the information and don't make up anything.
145
+ If the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. "What are some good restaurants around me?" => "Where are you?" or "When is the next flight to Tokyo" => "Where do you travel from?").
146
+ You are always very attentive to dates, in particular you try to resolve dates (e.g. "yesterday" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.
147
+ You follow these instructions in all languages, and always respond to the user in the language they use or request.
148
+ Next sections describe the capabilities that you have.
149
+
150
+ # WEB BROWSING INSTRUCTIONS
151
+
152
+ You cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.
153
+
154
+ # MULTI-MODAL INSTRUCTIONS
155
+
156
+ You have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.
157
+ You cannot read nor transcribe audio files or videos."""
158
+ ```
159
+
160
+ ### vLLM (recommended)
161
+
162
+ We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
163
+ to implement production-ready inference pipelines.
164
+
165
+ **_Installation_**
166
+
167
+ Make sure you install [`vLLM nightly`](https://github.com/vllm-project/vllm/):
168
+
169
+ ```
170
+ pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly --upgrade
171
+ ```
172
+
173
+ Doing so should automatically install [`mistral_common >= 1.5.4`](https://github.com/mistralai/mistral-common/releases/tag/v1.5.4).
174
+
175
+ To check:
176
+ ```
177
+ python -c "import mistral_common; print(mistral_common.__version__)"
178
+ ```
179
+
180
+ You can also use a ready-to-go [docker image](https://github.com/vllm-project/vllm/blob/main/Dockerfile), also available on the [docker hub](https://hub.docker.com/layers/vllm/vllm-openai/latest/images/sha256-de9032a92ffea7b5c007dad80b38fd44aac11eddc31c435f8e52f3b7404bbf39), followed by a nightly install of vLLM as shown above.
181
+
182
+ #### Server
183
+
184
+ We recommend that you use Mistral-Small-3.1-24B-Instruct-2503 in a server/client setting.
185
+
186
+ 1. Spin up a server:
187
+
188
+ ```
189
+ vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10' --tensor-parallel-size 2
190
+ ```
191
+
192
+ **Note:** Running Mistral-Small-3.1-24B-Instruct-2503 on GPU requires ~55 GB of GPU RAM in bf16 or fp16.
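A back-of-the-envelope check of where most of that memory goes: the sketch below converts the `total_size` recorded in this commit's `model.safetensors.index.json` into a parameter count and a weight footprint. It assumes every tensor is stored at 2 bytes per value; how the remaining memory is split between activations, KV cache and framework overhead is not stated in the source.

```python
# total_size as recorded in model.safetensors.index.json of this commit (bytes).
TOTAL_WEIGHT_BYTES = 48_022_722_560
BYTES_PER_PARAM = 2  # assumption: all tensors are bf16/fp16, i.e. 2 bytes per value

approx_params = TOTAL_WEIGHT_BYTES // BYTES_PER_PARAM
weights_gb = TOTAL_WEIGHT_BYTES / 1e9     # decimal gigabytes
weights_gib = TOTAL_WEIGHT_BYTES / 2**30  # binary gibibytes

print(f"~{approx_params / 1e9:.1f}B parameters")
print(f"weights: {weights_gb:.1f} GB ({weights_gib:.1f} GiB)")
```

Under that assumption the weights alone take about 48 GB, which is consistent with the `--tensor-parallel-size 2` setting in the command above when each GPU has less memory than that.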
193
+
194
+
195
+ 2. To query the server, you can use a simple Python snippet.
196
+
197
+ ```py
198
+ import requests
199
+ import json
200
+ from huggingface_hub import hf_hub_download
201
+ from datetime import datetime, timedelta
202
+
203
+ url = "http://<your-server-url>:8000/v1/chat/completions"
204
+ headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}
205
+
206
+ model = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"
207
+
208
+
209
+ def load_system_prompt(repo_id: str, filename: str) -> str:
210
+     file_path = hf_hub_download(repo_id=repo_id, filename=filename)
211
+     with open(file_path, "r") as file:
212
+         system_prompt = file.read()
213
+     today = datetime.today().strftime("%Y-%m-%d")
214
+     yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
215
+     model_name = repo_id.split("/")[-1]
216
+     return system_prompt.format(name=model_name, today=today, yesterday=yesterday)
217
+
218
+
219
+ SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")
220
+
221
+ image_url = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/europe.png"
222
+
223
+ messages = [
224
+ {"role": "system", "content": SYSTEM_PROMPT},
225
+ {
226
+ "role": "user",
227
+ "content": [
228
+ {
229
+ "type": "text",
230
+ "text": "Which of the depicted countries has the best food? Which the second and third and fourth? Name the country, its color on the map and one its city that is visible on the map, but is not the capital. Make absolutely sure to only name a city that can be seen on the map.",
231
+ },
232
+ {"type": "image_url", "image_url": {"url": image_url}},
233
+ ],
234
+ },
235
+ ]
236
+
237
+ data = {"model": model, "messages": messages, "temperature": 0.15}
238
+
239
+ response = requests.post(url, headers=headers, data=json.dumps(data))
240
+ print(response.json()["choices"][0]["message"]["content"])
241
+ # Determining the "best" food is highly subjective and depends on personal preferences. However, based on general popularity and recognition, here are some countries known for their cuisine:
242
+
243
+ # 1. **Italy** - Color: Light Green - City: Milan
244
+ # - Italian cuisine is renowned worldwide for its pasta, pizza, and various regional specialties.
245
+
246
+ # 2. **France** - Color: Brown - City: Lyon
247
+ # - French cuisine is celebrated for its sophistication, including dishes like coq au vin, bouillabaisse, and pastries like croissants and éclairs.
248
+
249
+ # 3. **Spain** - Color: Yellow - City: Bilbao
250
+ # - Spanish cuisine offers a variety of flavors, from paella and tapas to jamón ibérico and churros.
251
+
252
+ # 4. **Greece** - Not visible on the map
253
+ # - Greek cuisine is known for dishes like moussaka, souvlaki, and baklava. Unfortunately, Greece is not visible on the provided map, so I cannot name a city.
254
+
255
+ # Since Greece is not visible on the map, I'll replace it with another country known for its good food:
256
+
257
+ # 4. **Turkey** - Color: Light Green (east part of the map) - City: Istanbul
258
+ # - Turkish cuisine is diverse and includes dishes like kebabs, meze, and baklava.
259
+ ```
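The request above references a remote image by URL. For a local file, one option is a base64 data URL in the same `image_url` field. The sketch below only builds that content block; it assumes the server accepts data URLs the way OpenAI-compatible endpoints commonly do, and the helper name `image_bytes_to_content_block` is hypothetical.

```python
import base64


def image_bytes_to_content_block(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in an image_url content block using a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}}


# Placeholder bytes stand in for a real file, e.g. open("map.png", "rb").read().
block = image_bytes_to_content_block(b"\x89PNG placeholder")
print(block["image_url"]["url"][:22])
```

The returned dictionary can take the place of the `{"type": "image_url", ...}` entry in the user message's `content` list.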
260
+
261
+ ### Function calling
262
+
263
+ Mistral-Small-3.1-24B-Instruct-2503 is excellent at function / tool calling tasks via vLLM. *E.g.:*
264
+
265
+ <details>
266
+ <summary>Example</summary>
267
+
268
+ ```py
269
+ import requests
270
+ import json
271
+ from huggingface_hub import hf_hub_download
272
+ from datetime import datetime, timedelta
273
+
274
+ url = "http://<your-url>:8000/v1/chat/completions"
275
+ headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}
276
+
277
+ model = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"
278
+
279
+
280
+ def load_system_prompt(repo_id: str, filename: str) -> str:
281
+     file_path = hf_hub_download(repo_id=repo_id, filename=filename)
282
+     with open(file_path, "r") as file:
283
+         system_prompt = file.read()
284
+     today = datetime.today().strftime("%Y-%m-%d")
285
+     yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
286
+     model_name = repo_id.split("/")[-1]
287
+     return system_prompt.format(name=model_name, today=today, yesterday=yesterday)
288
+
289
+
290
+ SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")
291
+
292
+
293
+ tools = [
294
+ {
295
+ "type": "function",
296
+ "function": {
297
+ "name": "get_current_weather",
298
+ "description": "Get the current weather in a given location",
299
+ "parameters": {
300
+ "type": "object",
301
+ "properties": {
302
+ "city": {
303
+ "type": "string",
304
+ "description": "The city to find the weather for, e.g. 'San Francisco'",
305
+ },
306
+ "state": {
307
+ "type": "string",
308
+ "description": "The state abbreviation, e.g. 'CA' for California",
309
+ },
310
+ "unit": {
311
+ "type": "string",
312
+ "description": "The unit for temperature",
313
+ "enum": ["celsius", "fahrenheit"],
314
+ },
315
+ },
316
+ "required": ["city", "state", "unit"],
317
+ },
318
+ },
319
+ },
320
+ {
321
+ "type": "function",
322
+ "function": {
323
+ "name": "rewrite",
324
+ "description": "Rewrite a given text for improved clarity",
325
+ "parameters": {
326
+ "type": "object",
327
+ "properties": {
328
+ "text": {
329
+ "type": "string",
330
+ "description": "The input text to rewrite",
331
+ }
332
+ },
333
+ },
334
+ },
335
+ },
336
+ ]
337
+
338
+ messages = [
339
+ {"role": "system", "content": SYSTEM_PROMPT},
340
+ {
341
+ "role": "user",
342
+ "content": "Could you please make the below article more concise?\n\nOpenAI is an artificial intelligence research laboratory consisting of the non-profit OpenAI Incorporated and its for-profit subsidiary corporation OpenAI Limited Partnership.",
343
+ },
344
+ {
345
+ "role": "assistant",
346
+ "content": "",
347
+ "tool_calls": [
348
+ {
349
+ "id": "bbc5b7ede",
350
+ "type": "function",
351
+ "function": {
352
+ "name": "rewrite",
353
+ "arguments": '{"text": "OpenAI is an artificial intelligence research laboratory consisting of the non-profit OpenAI Incorporated and its for-profit subsidiary corporation OpenAI Limited Partnership."}',
354
+ },
355
+ }
356
+ ],
357
+ },
358
+ {
359
+ "role": "tool",
360
+ "content": '{"action":"rewrite","outcome":"OpenAI is a FOR-profit company."}',
361
+ "tool_call_id": "bbc5b7ede",
362
+ "name": "rewrite",
363
+ },
364
+ {
365
+ "role": "assistant",
366
+ "content": "---\n\nOpenAI is a FOR-profit company.",
367
+ },
368
+ {
369
+ "role": "user",
370
+ "content": "Can you tell me what the temperature will be in Dallas, in Fahrenheit?",
371
+ },
372
+ ]
373
+
374
+ data = {"model": model, "messages": messages, "tools": tools, "temperature": 0.15}
375
+
376
+ response = requests.post(url, headers=headers, data=json.dumps(data))
377
+ print(response.json()["choices"][0]["message"]["tool_calls"])
378
+ # [{'id': '8PdihwL6d', 'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': '{"city": "Dallas", "state": "TX", "unit": "fahrenheit"}'}}]
379
+ ```
380
+
381
+ </details>
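The example stops once the model has emitted a tool call. A minimal sketch of the next step, executing the call locally and building the `tool` role message, could look like the following. The `get_current_weather` body is a hypothetical stand-in that returns no real data, and `LOCAL_TOOLS` and `run_tool_calls` are illustrative names, not part of vLLM.

```python
import json


def get_current_weather(city: str, state: str, unit: str) -> dict:
    """Hypothetical stand-in: a real implementation would query a weather service."""
    return {"city": city, "state": state, "unit": unit, "temperature": None}


# Map tool names (as declared in `tools`) to local callables.
LOCAL_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Execute each requested call and return `tool` role messages to append."""
    tool_messages = []
    for call in tool_calls:
        name = call["function"]["name"]
        # The arguments arrive as a JSON-encoded string, not a dict.
        arguments = json.loads(call["function"]["arguments"])
        result = LOCAL_TOOLS[name](**arguments)
        tool_messages.append(
            {"role": "tool", "content": json.dumps(result), "tool_call_id": call["id"], "name": name}
        )
    return tool_messages


# The tool call printed by the example above.
tool_calls = [
    {
        "id": "8PdihwL6d",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": '{"city": "Dallas", "state": "TX", "unit": "fahrenheit"}',
        },
    }
]
print(run_tool_calls(tool_calls))
```

To continue the conversation, the assistant message carrying `tool_calls` would be appended to `messages` first, then these `tool` messages, mirroring the `rewrite` exchange in the example.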
382
+
383
+ #### Offline
384
+
385
+ ```py
386
+ from vllm import LLM
387
+ from vllm.sampling_params import SamplingParams
388
+ from datetime import datetime, timedelta
389
+
390
+ SYSTEM_PROMPT = "You are a conversational agent that always answers straight to the point, always end your accurate response with an ASCII drawing of a cat."
391
+
392
+ user_prompt = "Give me 5 non-formal ways to say 'See you later' in French."
393
+
394
+ messages = [
395
+ {
396
+ "role": "system",
397
+ "content": SYSTEM_PROMPT
398
+ },
399
+ {
400
+ "role": "user",
401
+ "content": user_prompt
402
+ },
403
+ ]
404
+ model_name = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"
405
+ # note that running this model on GPU requires over 60 GB of GPU RAM
406
+ llm = LLM(model=model_name, tokenizer_mode="mistral")
407
+
408
+ sampling_params = SamplingParams(max_tokens=512, temperature=0.15)
409
+ outputs = llm.chat(messages, sampling_params=sampling_params)
410
+
411
+ print(outputs[0].outputs[0].text)
412
+ # Here are five non-formal ways to say "See you later" in French:
413
+
414
+ # 1. **À plus tard** - Until later
415
+ # 2. **À toute** - See you soon (informal)
416
+ # 3. **Salut** - Bye (can also mean hi)
417
+ # 4. **À plus** - See you later (informal)
418
+ # 5. **Ciao** - Bye (informal, borrowed from Italian)
419
+
420
+ # ```
421
+ # /\_/\
422
+ # ( o.o )
423
+ # > ^ <
424
+ # ```
425
+ ```
426
+
427
+ ### Transformers (untested)
428
+
429
+ Transformers-compatible model weights are also uploaded (thanks a lot @cyrilvallez).
430
+ However, the Transformers implementation was **not thoroughly tested**, only "vibe-checked".
431
+ Hence, we can only ensure 100% correct behavior when using the original weight format with vllm (see above).
SYSTEM_PROMPT.txt ADDED
@@ -0,0 +1,19 @@
1
+ You are {name}, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.
2
+ You power an AI assistant called Le Chat.
3
+ Your knowledge base was last updated on 2023-10-01.
4
+ The current date is {today}.
5
+
6
+ When you're not sure about some information, you say that you don't have the information and don't make up anything.
7
+ If the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. "What are some good restaurants around me?" => "Where are you?" or "When is the next flight to Tokyo" => "Where do you travel from?").
8
+ You are always very attentive to dates, in particular you try to resolve dates (e.g. "yesterday" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.
9
+ You follow these instructions in all languages, and always respond to the user in the language they use or request.
10
+ Next sections describe the capabilities that you have.
11
+
12
+ # WEB BROWSING INSTRUCTIONS
13
+
14
+ You cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.
15
+
16
+ # MULTI-MODAL INSTRUCTIONS
17
+
18
+ You have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.
19
+ You cannot read nor transcribe audio files or videos.
chat_template.json ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "chat_template": "{%- set today = strftime_now(\"%Y-%m-%d\") %}\n{%- set default_system_message = \"You are Mistral Small 3, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\\nYour knowledge base was last updated on 2023-10-01. The current date is \" + today + \".\\n\\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \\\"What are some good restaurants around me?\\\" => \\\"Where are you?\\\" or \\\"When is the next flight to Tokyo\\\" => \\\"Where do you travel from?\\\")\" %}\n\n{{- bos_token }}\n\n{%- if messages[0]['role'] == 'system' %}\n {%- set system_message = messages[0]['content'] %}\n {%- set loop_messages = messages[1:] %}\n{%- else %}\n {%- set system_message = default_system_message %}\n {%- set loop_messages = messages %}\n{%- endif %}\n{{- '[SYSTEM_PROMPT]' + system_message + '[/SYSTEM_PROMPT]' }}\n\n{%- for message in loop_messages %}\n {%- if message['role'] == 'user' %}\n\t {%- if message['content'] is string %}\n {{- '[INST]' + message['content'] + '[/INST]' }}\n\t {%- else %}\n\t\t {{- '[INST]' }}\n\t\t {%- for block in message['content'] %}\n\t\t\t {%- if block['type'] == 'text' %}\n\t\t\t\t {{- block['text'] }}\n\t\t\t {%- elif block['type'] == 'image' or block['type'] == 'image_url' %}\n\t\t\t\t {{- '[IMG]' }}\n\t\t\t\t{%- else %}\n\t\t\t\t {{- raise_exception('Only text and image blocks are supported in message content!') }}\n\t\t\t\t{%- endif %}\n\t\t\t{%- endfor %}\n\t\t {{- '[/INST]' }}\n\t\t{%- endif %}\n {%- elif message['role'] == 'system' %}\n {{- '[SYSTEM_PROMPT]' + message['content'] + '[/SYSTEM_PROMPT]' }}\n {%- elif message['role'] == 'assistant' %}\n {{- message['content'] + eos_token }}\n {%- else %}\n {{- raise_exception('Only user, system and assistant roles are supported!') }}\n {%- endif %}\n{%- endfor %}"
3
+ }
config.json ADDED
@@ -0,0 +1,46 @@
1
+ {
2
+ "architectures": [
3
+ "Mistral3ForConditionalGeneration"
4
+ ],
5
+ "image_token_index": 10,
6
+ "model_type": "mistral3",
7
+ "multimodal_projector_bias": false,
8
+ "projector_hidden_act": "gelu",
9
+ "spatial_merge_size": 2,
10
+ "text_config": {
11
+ "attention_dropout": 0.0,
12
+ "head_dim": 128,
13
+ "hidden_act": "silu",
14
+ "hidden_size": 5120,
15
+ "initializer_range": 0.02,
16
+ "intermediate_size": 32768,
17
+ "max_position_embeddings": 131072,
18
+ "model_type": "mistral",
19
+ "num_attention_heads": 32,
20
+ "num_hidden_layers": 40,
21
+ "num_key_value_heads": 8,
22
+ "rms_norm_eps": 1e-05,
23
+ "rope_theta": 1000000000.0,
24
+ "sliding_window": null,
25
+ "use_cache": true,
26
+ "vocab_size": 131072
27
+ },
28
+ "torch_dtype": "bfloat16",
29
+ "transformers_version": "4.50.0.dev0",
30
+ "vision_config": {
31
+ "attention_dropout": 0.0,
32
+ "head_dim": 64,
33
+ "hidden_act": "gelu",
34
+ "hidden_size": 1024,
35
+ "image_size": 1540,
36
+ "initializer_range": 0.02,
37
+ "intermediate_size": 4096,
38
+ "model_type": "pixtral",
39
+ "num_attention_heads": 16,
40
+ "num_channels": 3,
41
+ "num_hidden_layers": 24,
42
+ "patch_size": 14,
43
+ "rope_theta": 10000.0
44
+ },
45
+ "vision_feature_layer": -1
46
+ }
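The attention settings in `text_config` are enough for a rough estimate of how much memory the KV cache needs per token. The sketch below does that arithmetic. It assumes the cache is kept at 2 bytes per value (matching `torch_dtype: bfloat16`) and ignores any paging or quantization an inference engine might apply.

```python
# Values copied from text_config in the config.json above.
NUM_HIDDEN_LAYERS = 40
NUM_KEY_VALUE_HEADS = 8
HEAD_DIM = 128
MAX_POSITION_EMBEDDINGS = 131072
BYTES_PER_VALUE = 2  # assumption: cache kept in bfloat16

# Each token stores one key and one value vector per KV head in every layer.
kv_bytes_per_token = 2 * NUM_HIDDEN_LAYERS * NUM_KEY_VALUE_HEADS * HEAD_DIM * BYTES_PER_VALUE
kv_bytes_full_context = kv_bytes_per_token * MAX_POSITION_EMBEDDINGS

print(f"{kv_bytes_per_token} bytes per token")
print(f"{kv_bytes_full_context / 2**30:.1f} GiB for one {MAX_POSITION_EMBEDDINGS}-token sequence")
```

With 8 key/value heads against 32 attention heads, the cache is a quarter of the size it would be if every attention head kept its own keys and values.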
consolidated.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d446ca97599fa9d98b2e3744d8b83019837a2fe34a80f4353120b1e9b6249b1e
3
+ size 48022792280
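The three lines above are a Git LFS pointer: the repository stores this small text file while the actual weights live in LFS storage. A minimal sketch of reading such a pointer, using the values from this hunk, follows; `parse_lfs_pointer` is a hypothetical helper, not part of any Git tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d446ca97599fa9d98b2e3744d8b83019837a2fe34a80f4353120b1e9b6249b1e
size 48022792280
"""
info = parse_lfs_pointer(pointer)
print(info["algorithm"], info["size"])
```

The `oid` digest and `size` can be compared against a downloaded file to confirm the transfer completed intact.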
generation_config.json ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "_from_model_config": true,
3
+ "bos_token_id": 1,
4
+ "eos_token_id": 2,
5
+ "transformers_version": "4.50.0.dev0"
6
+ }
model-00001-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4d735fce2330e13fa96c0ad839fbf703481ab05680593ad32b9219be30c9ff92
3
+ size 4883550696
model-00002-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:99c38a089b14a0fca228899132eb8db54ea551bdc96b639267d5177e4a172776
3
+ size 4781593336
model-00003-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4917b19a128315c56c8ff3cdf531d9a88371314098c2698e28ac03a6d8c77813
3
+ size 4886472224
model-00004-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:acc83eeee5e993095ca9ad7518bf29f511b9e13b6b57d4f53580e9b8b6c084fe
3
+ size 4781593376
model-00005-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5d81e8c278b8af82f96a69f36ce71e743a70aa8fe28d3118bd8d24dae5001b62
3
+ size 4781593368
model-00006-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8ca257e4f5ee8446536562de3e0b7e1e2614d23b22311bf29bb88385218ba1d9
3
+ size 4886472248
model-00007-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f880e9065a25eac3e4986d9b4d45ae0225fe59e093731ad633fded7e91ce058f
3
+ size 4781593376
model-00008-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3bc98dc787b471919b469862a9c67f39f9ab2a407e641614da858cd35cb65823
3
+ size 4781593368
model-00009-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b1ad95f53916db9384b80f3a233fb524d3085136b9065fd39a1fdf4f1749c401
3
+ size 4886472248
model-00010-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:97a62f323631157567b35651d7f835baaf290e5fc17c92bb888023cdb43218d1
3
+ size 4571866320
model.safetensors.index.json ADDED
@@ -0,0 +1,592 @@
+ {
+ "metadata": {
+ "total_size": 48022722560
+ },
+ "weight_map": {
+ "language_model.lm_head.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.embed_tokens.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.input_layernorm.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.0.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.input_layernorm.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.mlp.down_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.mlp.up_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.post_attention_layernorm.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.1.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.10.input_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.10.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.11.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.11.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.11.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.11.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.11.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.11.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.11.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.11.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.11.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.12.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.12.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.13.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.mlp.down_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.mlp.up_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.post_attention_layernorm.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.14.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.15.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.15.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.15.mlp.gate_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.15.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.15.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.15.self_attn.k_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.15.self_attn.o_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.15.self_attn.q_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.15.self_attn.v_proj.weight": "model-00004-of-00010.safetensors",
+ "language_model.model.layers.16.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.16.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.17.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.mlp.down_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.post_attention_layernorm.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.18.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.19.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.19.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.19.mlp.gate_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.19.mlp.up_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.19.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.19.self_attn.k_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.19.self_attn.o_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.19.self_attn.q_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.19.self_attn.v_proj.weight": "model-00005-of-00010.safetensors",
+ "language_model.model.layers.2.input_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.2.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.2.mlp.gate_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.2.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.2.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.2.self_attn.k_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.2.self_attn.o_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.2.self_attn.q_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.2.self_attn.v_proj.weight": "model-00001-of-00010.safetensors",
+ "language_model.model.layers.20.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.20.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.21.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.22.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.mlp.down_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.mlp.gate_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.mlp.up_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.post_attention_layernorm.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.23.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.24.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.24.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.24.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.24.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.24.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.24.self_attn.k_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.24.self_attn.o_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.24.self_attn.q_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.24.self_attn.v_proj.weight": "model-00006-of-00010.safetensors",
+ "language_model.model.layers.25.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.25.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.26.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.mlp.down_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.mlp.up_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.post_attention_layernorm.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.27.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.28.input_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.28.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.28.mlp.gate_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.28.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.28.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.28.self_attn.k_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.28.self_attn.o_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.28.self_attn.q_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.28.self_attn.v_proj.weight": "model-00007-of-00010.safetensors",
+ "language_model.model.layers.29.input_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.29.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.3.input_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.3.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.30.input_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.30.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.input_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.mlp.down_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.post_attention_layernorm.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.31.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.32.input_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.32.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.32.mlp.gate_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.32.mlp.up_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.32.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.32.self_attn.k_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.32.self_attn.o_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.32.self_attn.q_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.32.self_attn.v_proj.weight": "model-00008-of-00010.safetensors",
+ "language_model.model.layers.33.input_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.33.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.input_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.34.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.input_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.35.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.input_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.mlp.down_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.mlp.gate_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.mlp.up_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.post_attention_layernorm.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.36.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.37.input_layernorm.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.37.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.37.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.37.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.37.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.37.self_attn.k_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.37.self_attn.o_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.37.self_attn.q_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.37.self_attn.v_proj.weight": "model-00009-of-00010.safetensors",
+ "language_model.model.layers.38.input_layernorm.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.38.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.input_layernorm.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.mlp.down_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.mlp.gate_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.mlp.up_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.post_attention_layernorm.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.self_attn.k_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.self_attn.o_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.self_attn.q_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.39.self_attn.v_proj.weight": "model-00010-of-00010.safetensors",
+ "language_model.model.layers.4.input_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.4.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.input_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.mlp.down_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.post_attention_layernorm.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.5.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.6.input_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.6.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.6.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.6.mlp.up_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.6.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.6.self_attn.k_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.6.self_attn.o_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.6.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.6.self_attn.v_proj.weight": "model-00002-of-00010.safetensors",
+ "language_model.model.layers.7.input_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.7.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.input_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
+ "language_model.model.layers.8.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
359
+ "language_model.model.layers.9.input_layernorm.weight": "model-00003-of-00010.safetensors",
360
+ "language_model.model.layers.9.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
361
+ "language_model.model.layers.9.mlp.gate_proj.weight": "model-00003-of-00010.safetensors",
362
+ "language_model.model.layers.9.mlp.up_proj.weight": "model-00003-of-00010.safetensors",
363
+ "language_model.model.layers.9.post_attention_layernorm.weight": "model-00003-of-00010.safetensors",
364
+ "language_model.model.layers.9.self_attn.k_proj.weight": "model-00003-of-00010.safetensors",
365
+ "language_model.model.layers.9.self_attn.o_proj.weight": "model-00003-of-00010.safetensors",
366
+ "language_model.model.layers.9.self_attn.q_proj.weight": "model-00003-of-00010.safetensors",
367
+ "language_model.model.layers.9.self_attn.v_proj.weight": "model-00003-of-00010.safetensors",
368
+ "language_model.model.norm.weight": "model-00010-of-00010.safetensors",
369
+ "multi_modal_projector.linear_1.weight": "model-00001-of-00010.safetensors",
370
+ "multi_modal_projector.linear_2.weight": "model-00001-of-00010.safetensors",
371
+ "multi_modal_projector.norm.weight": "model-00001-of-00010.safetensors",
372
+ "multi_modal_projector.patch_merger.merging_layer.weight": "model-00001-of-00010.safetensors",
373
+ "vision_tower.ln_pre.weight": "model-00001-of-00010.safetensors",
374
+ "vision_tower.patch_conv.weight": "model-00001-of-00010.safetensors",
375
+ "vision_tower.transformer.layers.0.attention.k_proj.weight": "model-00001-of-00010.safetensors",
376
+ "vision_tower.transformer.layers.0.attention.o_proj.weight": "model-00001-of-00010.safetensors",
377
+ "vision_tower.transformer.layers.0.attention.q_proj.weight": "model-00001-of-00010.safetensors",
378
+ "vision_tower.transformer.layers.0.attention.v_proj.weight": "model-00001-of-00010.safetensors",
379
+ "vision_tower.transformer.layers.0.attention_norm.weight": "model-00001-of-00010.safetensors",
380
+ "vision_tower.transformer.layers.0.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
381
+ "vision_tower.transformer.layers.0.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
382
+ "vision_tower.transformer.layers.0.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
383
+ "vision_tower.transformer.layers.0.ffn_norm.weight": "model-00001-of-00010.safetensors",
384
+ "vision_tower.transformer.layers.1.attention.k_proj.weight": "model-00001-of-00010.safetensors",
385
+ "vision_tower.transformer.layers.1.attention.o_proj.weight": "model-00001-of-00010.safetensors",
386
+ "vision_tower.transformer.layers.1.attention.q_proj.weight": "model-00001-of-00010.safetensors",
387
+ "vision_tower.transformer.layers.1.attention.v_proj.weight": "model-00001-of-00010.safetensors",
388
+ "vision_tower.transformer.layers.1.attention_norm.weight": "model-00001-of-00010.safetensors",
389
+ "vision_tower.transformer.layers.1.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
390
+ "vision_tower.transformer.layers.1.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
391
+ "vision_tower.transformer.layers.1.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
392
+ "vision_tower.transformer.layers.1.ffn_norm.weight": "model-00001-of-00010.safetensors",
393
+ "vision_tower.transformer.layers.10.attention.k_proj.weight": "model-00001-of-00010.safetensors",
394
+ "vision_tower.transformer.layers.10.attention.o_proj.weight": "model-00001-of-00010.safetensors",
395
+ "vision_tower.transformer.layers.10.attention.q_proj.weight": "model-00001-of-00010.safetensors",
396
+ "vision_tower.transformer.layers.10.attention.v_proj.weight": "model-00001-of-00010.safetensors",
397
+ "vision_tower.transformer.layers.10.attention_norm.weight": "model-00001-of-00010.safetensors",
398
+ "vision_tower.transformer.layers.10.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
399
+ "vision_tower.transformer.layers.10.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
400
+ "vision_tower.transformer.layers.10.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
401
+ "vision_tower.transformer.layers.10.ffn_norm.weight": "model-00001-of-00010.safetensors",
402
+ "vision_tower.transformer.layers.11.attention.k_proj.weight": "model-00001-of-00010.safetensors",
403
+ "vision_tower.transformer.layers.11.attention.o_proj.weight": "model-00001-of-00010.safetensors",
404
+ "vision_tower.transformer.layers.11.attention.q_proj.weight": "model-00001-of-00010.safetensors",
405
+ "vision_tower.transformer.layers.11.attention.v_proj.weight": "model-00001-of-00010.safetensors",
406
+ "vision_tower.transformer.layers.11.attention_norm.weight": "model-00001-of-00010.safetensors",
407
+ "vision_tower.transformer.layers.11.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
408
+ "vision_tower.transformer.layers.11.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
409
+ "vision_tower.transformer.layers.11.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
410
+ "vision_tower.transformer.layers.11.ffn_norm.weight": "model-00001-of-00010.safetensors",
411
+ "vision_tower.transformer.layers.12.attention.k_proj.weight": "model-00001-of-00010.safetensors",
412
+ "vision_tower.transformer.layers.12.attention.o_proj.weight": "model-00001-of-00010.safetensors",
413
+ "vision_tower.transformer.layers.12.attention.q_proj.weight": "model-00001-of-00010.safetensors",
414
+ "vision_tower.transformer.layers.12.attention.v_proj.weight": "model-00001-of-00010.safetensors",
415
+ "vision_tower.transformer.layers.12.attention_norm.weight": "model-00001-of-00010.safetensors",
416
+ "vision_tower.transformer.layers.12.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
417
+ "vision_tower.transformer.layers.12.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
418
+ "vision_tower.transformer.layers.12.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
419
+ "vision_tower.transformer.layers.12.ffn_norm.weight": "model-00001-of-00010.safetensors",
420
+ "vision_tower.transformer.layers.13.attention.k_proj.weight": "model-00001-of-00010.safetensors",
421
+ "vision_tower.transformer.layers.13.attention.o_proj.weight": "model-00001-of-00010.safetensors",
422
+ "vision_tower.transformer.layers.13.attention.q_proj.weight": "model-00001-of-00010.safetensors",
423
+ "vision_tower.transformer.layers.13.attention.v_proj.weight": "model-00001-of-00010.safetensors",
424
+ "vision_tower.transformer.layers.13.attention_norm.weight": "model-00001-of-00010.safetensors",
425
+ "vision_tower.transformer.layers.13.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
426
+ "vision_tower.transformer.layers.13.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
427
+ "vision_tower.transformer.layers.13.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
428
+ "vision_tower.transformer.layers.13.ffn_norm.weight": "model-00001-of-00010.safetensors",
429
+ "vision_tower.transformer.layers.14.attention.k_proj.weight": "model-00001-of-00010.safetensors",
430
+ "vision_tower.transformer.layers.14.attention.o_proj.weight": "model-00001-of-00010.safetensors",
431
+ "vision_tower.transformer.layers.14.attention.q_proj.weight": "model-00001-of-00010.safetensors",
432
+ "vision_tower.transformer.layers.14.attention.v_proj.weight": "model-00001-of-00010.safetensors",
433
+ "vision_tower.transformer.layers.14.attention_norm.weight": "model-00001-of-00010.safetensors",
434
+ "vision_tower.transformer.layers.14.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
435
+ "vision_tower.transformer.layers.14.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
436
+ "vision_tower.transformer.layers.14.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
437
+ "vision_tower.transformer.layers.14.ffn_norm.weight": "model-00001-of-00010.safetensors",
438
+ "vision_tower.transformer.layers.15.attention.k_proj.weight": "model-00001-of-00010.safetensors",
439
+ "vision_tower.transformer.layers.15.attention.o_proj.weight": "model-00001-of-00010.safetensors",
440
+ "vision_tower.transformer.layers.15.attention.q_proj.weight": "model-00001-of-00010.safetensors",
441
+ "vision_tower.transformer.layers.15.attention.v_proj.weight": "model-00001-of-00010.safetensors",
442
+ "vision_tower.transformer.layers.15.attention_norm.weight": "model-00001-of-00010.safetensors",
443
+ "vision_tower.transformer.layers.15.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
444
+ "vision_tower.transformer.layers.15.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
445
+ "vision_tower.transformer.layers.15.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
446
+ "vision_tower.transformer.layers.15.ffn_norm.weight": "model-00001-of-00010.safetensors",
447
+ "vision_tower.transformer.layers.16.attention.k_proj.weight": "model-00001-of-00010.safetensors",
448
+ "vision_tower.transformer.layers.16.attention.o_proj.weight": "model-00001-of-00010.safetensors",
449
+ "vision_tower.transformer.layers.16.attention.q_proj.weight": "model-00001-of-00010.safetensors",
450
+ "vision_tower.transformer.layers.16.attention.v_proj.weight": "model-00001-of-00010.safetensors",
451
+ "vision_tower.transformer.layers.16.attention_norm.weight": "model-00001-of-00010.safetensors",
452
+ "vision_tower.transformer.layers.16.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
453
+ "vision_tower.transformer.layers.16.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
454
+ "vision_tower.transformer.layers.16.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
455
+ "vision_tower.transformer.layers.16.ffn_norm.weight": "model-00001-of-00010.safetensors",
456
+ "vision_tower.transformer.layers.17.attention.k_proj.weight": "model-00001-of-00010.safetensors",
457
+ "vision_tower.transformer.layers.17.attention.o_proj.weight": "model-00001-of-00010.safetensors",
458
+ "vision_tower.transformer.layers.17.attention.q_proj.weight": "model-00001-of-00010.safetensors",
459
+ "vision_tower.transformer.layers.17.attention.v_proj.weight": "model-00001-of-00010.safetensors",
460
+ "vision_tower.transformer.layers.17.attention_norm.weight": "model-00001-of-00010.safetensors",
461
+ "vision_tower.transformer.layers.17.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
462
+ "vision_tower.transformer.layers.17.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
463
+ "vision_tower.transformer.layers.17.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
464
+ "vision_tower.transformer.layers.17.ffn_norm.weight": "model-00001-of-00010.safetensors",
465
+ "vision_tower.transformer.layers.18.attention.k_proj.weight": "model-00001-of-00010.safetensors",
466
+ "vision_tower.transformer.layers.18.attention.o_proj.weight": "model-00001-of-00010.safetensors",
467
+ "vision_tower.transformer.layers.18.attention.q_proj.weight": "model-00001-of-00010.safetensors",
468
+ "vision_tower.transformer.layers.18.attention.v_proj.weight": "model-00001-of-00010.safetensors",
469
+ "vision_tower.transformer.layers.18.attention_norm.weight": "model-00001-of-00010.safetensors",
470
+ "vision_tower.transformer.layers.18.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
471
+ "vision_tower.transformer.layers.18.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
472
+ "vision_tower.transformer.layers.18.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
473
+ "vision_tower.transformer.layers.18.ffn_norm.weight": "model-00001-of-00010.safetensors",
474
+ "vision_tower.transformer.layers.19.attention.k_proj.weight": "model-00001-of-00010.safetensors",
475
+ "vision_tower.transformer.layers.19.attention.o_proj.weight": "model-00001-of-00010.safetensors",
476
+ "vision_tower.transformer.layers.19.attention.q_proj.weight": "model-00001-of-00010.safetensors",
477
+ "vision_tower.transformer.layers.19.attention.v_proj.weight": "model-00001-of-00010.safetensors",
478
+ "vision_tower.transformer.layers.19.attention_norm.weight": "model-00001-of-00010.safetensors",
479
+ "vision_tower.transformer.layers.19.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
480
+ "vision_tower.transformer.layers.19.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
481
+ "vision_tower.transformer.layers.19.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
482
+ "vision_tower.transformer.layers.19.ffn_norm.weight": "model-00001-of-00010.safetensors",
483
+ "vision_tower.transformer.layers.2.attention.k_proj.weight": "model-00001-of-00010.safetensors",
484
+ "vision_tower.transformer.layers.2.attention.o_proj.weight": "model-00001-of-00010.safetensors",
485
+ "vision_tower.transformer.layers.2.attention.q_proj.weight": "model-00001-of-00010.safetensors",
486
+ "vision_tower.transformer.layers.2.attention.v_proj.weight": "model-00001-of-00010.safetensors",
487
+ "vision_tower.transformer.layers.2.attention_norm.weight": "model-00001-of-00010.safetensors",
488
+ "vision_tower.transformer.layers.2.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
489
+ "vision_tower.transformer.layers.2.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
490
+ "vision_tower.transformer.layers.2.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
491
+ "vision_tower.transformer.layers.2.ffn_norm.weight": "model-00001-of-00010.safetensors",
492
+ "vision_tower.transformer.layers.20.attention.k_proj.weight": "model-00001-of-00010.safetensors",
493
+ "vision_tower.transformer.layers.20.attention.o_proj.weight": "model-00001-of-00010.safetensors",
494
+ "vision_tower.transformer.layers.20.attention.q_proj.weight": "model-00001-of-00010.safetensors",
495
+ "vision_tower.transformer.layers.20.attention.v_proj.weight": "model-00001-of-00010.safetensors",
496
+ "vision_tower.transformer.layers.20.attention_norm.weight": "model-00001-of-00010.safetensors",
497
+ "vision_tower.transformer.layers.20.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
498
+ "vision_tower.transformer.layers.20.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
499
+ "vision_tower.transformer.layers.20.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
500
+ "vision_tower.transformer.layers.20.ffn_norm.weight": "model-00001-of-00010.safetensors",
501
+ "vision_tower.transformer.layers.21.attention.k_proj.weight": "model-00001-of-00010.safetensors",
502
+ "vision_tower.transformer.layers.21.attention.o_proj.weight": "model-00001-of-00010.safetensors",
503
+ "vision_tower.transformer.layers.21.attention.q_proj.weight": "model-00001-of-00010.safetensors",
504
+ "vision_tower.transformer.layers.21.attention.v_proj.weight": "model-00001-of-00010.safetensors",
505
+ "vision_tower.transformer.layers.21.attention_norm.weight": "model-00001-of-00010.safetensors",
506
+ "vision_tower.transformer.layers.21.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
507
+ "vision_tower.transformer.layers.21.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
508
+ "vision_tower.transformer.layers.21.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
509
+ "vision_tower.transformer.layers.21.ffn_norm.weight": "model-00001-of-00010.safetensors",
510
+ "vision_tower.transformer.layers.22.attention.k_proj.weight": "model-00001-of-00010.safetensors",
511
+ "vision_tower.transformer.layers.22.attention.o_proj.weight": "model-00001-of-00010.safetensors",
512
+ "vision_tower.transformer.layers.22.attention.q_proj.weight": "model-00001-of-00010.safetensors",
513
+ "vision_tower.transformer.layers.22.attention.v_proj.weight": "model-00001-of-00010.safetensors",
514
+ "vision_tower.transformer.layers.22.attention_norm.weight": "model-00001-of-00010.safetensors",
515
+ "vision_tower.transformer.layers.22.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
516
+ "vision_tower.transformer.layers.22.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
517
+ "vision_tower.transformer.layers.22.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
518
+ "vision_tower.transformer.layers.22.ffn_norm.weight": "model-00001-of-00010.safetensors",
519
+ "vision_tower.transformer.layers.23.attention.k_proj.weight": "model-00001-of-00010.safetensors",
520
+ "vision_tower.transformer.layers.23.attention.o_proj.weight": "model-00001-of-00010.safetensors",
521
+ "vision_tower.transformer.layers.23.attention.q_proj.weight": "model-00001-of-00010.safetensors",
522
+ "vision_tower.transformer.layers.23.attention.v_proj.weight": "model-00001-of-00010.safetensors",
523
+ "vision_tower.transformer.layers.23.attention_norm.weight": "model-00001-of-00010.safetensors",
524
+ "vision_tower.transformer.layers.23.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
525
+ "vision_tower.transformer.layers.23.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
526
+ "vision_tower.transformer.layers.23.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
527
+ "vision_tower.transformer.layers.23.ffn_norm.weight": "model-00001-of-00010.safetensors",
528
+ "vision_tower.transformer.layers.3.attention.k_proj.weight": "model-00001-of-00010.safetensors",
529
+ "vision_tower.transformer.layers.3.attention.o_proj.weight": "model-00001-of-00010.safetensors",
530
+ "vision_tower.transformer.layers.3.attention.q_proj.weight": "model-00001-of-00010.safetensors",
531
+ "vision_tower.transformer.layers.3.attention.v_proj.weight": "model-00001-of-00010.safetensors",
532
+ "vision_tower.transformer.layers.3.attention_norm.weight": "model-00001-of-00010.safetensors",
533
+ "vision_tower.transformer.layers.3.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
534
+ "vision_tower.transformer.layers.3.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
535
+ "vision_tower.transformer.layers.3.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
536
+ "vision_tower.transformer.layers.3.ffn_norm.weight": "model-00001-of-00010.safetensors",
537
+ "vision_tower.transformer.layers.4.attention.k_proj.weight": "model-00001-of-00010.safetensors",
538
+ "vision_tower.transformer.layers.4.attention.o_proj.weight": "model-00001-of-00010.safetensors",
539
+ "vision_tower.transformer.layers.4.attention.q_proj.weight": "model-00001-of-00010.safetensors",
540
+ "vision_tower.transformer.layers.4.attention.v_proj.weight": "model-00001-of-00010.safetensors",
541
+ "vision_tower.transformer.layers.4.attention_norm.weight": "model-00001-of-00010.safetensors",
542
+ "vision_tower.transformer.layers.4.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
543
+ "vision_tower.transformer.layers.4.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
544
+ "vision_tower.transformer.layers.4.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
545
+ "vision_tower.transformer.layers.4.ffn_norm.weight": "model-00001-of-00010.safetensors",
546
+ "vision_tower.transformer.layers.5.attention.k_proj.weight": "model-00001-of-00010.safetensors",
547
+ "vision_tower.transformer.layers.5.attention.o_proj.weight": "model-00001-of-00010.safetensors",
548
+ "vision_tower.transformer.layers.5.attention.q_proj.weight": "model-00001-of-00010.safetensors",
549
+ "vision_tower.transformer.layers.5.attention.v_proj.weight": "model-00001-of-00010.safetensors",
550
+ "vision_tower.transformer.layers.5.attention_norm.weight": "model-00001-of-00010.safetensors",
551
+ "vision_tower.transformer.layers.5.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
552
+ "vision_tower.transformer.layers.5.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
553
+ "vision_tower.transformer.layers.5.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
554
+ "vision_tower.transformer.layers.5.ffn_norm.weight": "model-00001-of-00010.safetensors",
555
+ "vision_tower.transformer.layers.6.attention.k_proj.weight": "model-00001-of-00010.safetensors",
556
+ "vision_tower.transformer.layers.6.attention.o_proj.weight": "model-00001-of-00010.safetensors",
557
+ "vision_tower.transformer.layers.6.attention.q_proj.weight": "model-00001-of-00010.safetensors",
558
+ "vision_tower.transformer.layers.6.attention.v_proj.weight": "model-00001-of-00010.safetensors",
559
+ "vision_tower.transformer.layers.6.attention_norm.weight": "model-00001-of-00010.safetensors",
560
+ "vision_tower.transformer.layers.6.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
561
+ "vision_tower.transformer.layers.6.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
562
+ "vision_tower.transformer.layers.6.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
563
+ "vision_tower.transformer.layers.6.ffn_norm.weight": "model-00001-of-00010.safetensors",
564
+ "vision_tower.transformer.layers.7.attention.k_proj.weight": "model-00001-of-00010.safetensors",
565
+ "vision_tower.transformer.layers.7.attention.o_proj.weight": "model-00001-of-00010.safetensors",
566
+ "vision_tower.transformer.layers.7.attention.q_proj.weight": "model-00001-of-00010.safetensors",
567
+ "vision_tower.transformer.layers.7.attention.v_proj.weight": "model-00001-of-00010.safetensors",
568
+ "vision_tower.transformer.layers.7.attention_norm.weight": "model-00001-of-00010.safetensors",
569
+ "vision_tower.transformer.layers.7.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
570
+ "vision_tower.transformer.layers.7.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
571
+ "vision_tower.transformer.layers.7.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
572
+ "vision_tower.transformer.layers.7.ffn_norm.weight": "model-00001-of-00010.safetensors",
573
+ "vision_tower.transformer.layers.8.attention.k_proj.weight": "model-00001-of-00010.safetensors",
574
+ "vision_tower.transformer.layers.8.attention.o_proj.weight": "model-00001-of-00010.safetensors",
575
+ "vision_tower.transformer.layers.8.attention.q_proj.weight": "model-00001-of-00010.safetensors",
576
+ "vision_tower.transformer.layers.8.attention.v_proj.weight": "model-00001-of-00010.safetensors",
577
+ "vision_tower.transformer.layers.8.attention_norm.weight": "model-00001-of-00010.safetensors",
578
+ "vision_tower.transformer.layers.8.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
579
+ "vision_tower.transformer.layers.8.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
580
+ "vision_tower.transformer.layers.8.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
581
+ "vision_tower.transformer.layers.8.ffn_norm.weight": "model-00001-of-00010.safetensors",
582
+ "vision_tower.transformer.layers.9.attention.k_proj.weight": "model-00001-of-00010.safetensors",
583
+ "vision_tower.transformer.layers.9.attention.o_proj.weight": "model-00001-of-00010.safetensors",
584
+ "vision_tower.transformer.layers.9.attention.q_proj.weight": "model-00001-of-00010.safetensors",
585
+ "vision_tower.transformer.layers.9.attention.v_proj.weight": "model-00001-of-00010.safetensors",
586
+ "vision_tower.transformer.layers.9.attention_norm.weight": "model-00001-of-00010.safetensors",
587
+ "vision_tower.transformer.layers.9.feed_forward.down_proj.weight": "model-00001-of-00010.safetensors",
588
+ "vision_tower.transformer.layers.9.feed_forward.gate_proj.weight": "model-00001-of-00010.safetensors",
589
+ "vision_tower.transformer.layers.9.feed_forward.up_proj.weight": "model-00001-of-00010.safetensors",
590
+ "vision_tower.transformer.layers.9.ffn_norm.weight": "model-00001-of-00010.safetensors"
591
+ }
592
+ }
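Note (not part of the commit): the `weight_map` above assigns each tensor name to one of ten shard files, and a layer's tensors are not guaranteed to sit in a single shard (layer 6 of the language model is split between shards 2 and 3). The following is a minimal sketch of how such a map could be grouped by shard. The sample entries are copied from the diff; `shards_for_prefix` is a hypothetical helper name, not an API of any library.

```python
from collections import defaultdict

# A few entries copied from the weight_map above (the full map is much longer).
weight_map = {
    "language_model.model.layers.6.input_layernorm.weight": "model-00003-of-00010.safetensors",
    "language_model.model.layers.6.mlp.down_proj.weight": "model-00003-of-00010.safetensors",
    "language_model.model.layers.6.mlp.gate_proj.weight": "model-00002-of-00010.safetensors",
    "language_model.model.layers.6.self_attn.q_proj.weight": "model-00002-of-00010.safetensors",
    "language_model.model.norm.weight": "model-00010-of-00010.safetensors",
    "vision_tower.patch_conv.weight": "model-00001-of-00010.safetensors",
}


def shards_for_prefix(weight_map, prefix):
    """Hypothetical helper: group tensor names starting with `prefix` by shard file."""
    by_shard = defaultdict(list)
    for name, shard in weight_map.items():
        if name.startswith(prefix):
            by_shard[shard].append(name)
    # Sort shards and names so the result is deterministic.
    return {shard: sorted(names) for shard, names in sorted(by_shard.items())}


layer6 = shards_for_prefix(weight_map, "language_model.model.layers.6.")
for shard, names in layer6.items():
    print(shard, len(names))
```

A loader that wants a single layer therefore has to open every shard returned for that layer's prefix, not just one file.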
params.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "dim": 5120,
+   "n_layers": 40,
+   "head_dim": 128,
+   "hidden_dim": 32768,
+   "n_heads": 32,
+   "n_kv_heads": 8,
+   "rope_theta": 1000000000.0,
+   "norm_eps": 1e-05,
+   "vocab_size": 131072,
+   "vision_encoder": {
+     "hidden_size": 1024,
+     "num_channels": 3,
+     "max_image_size": 1540,
+     "patch_size": 14,
+     "rope_theta": 10000.0,
+     "intermediate_size": 4096,
+     "num_hidden_layers": 24,
+     "num_attention_heads": 16,
+     "adapter_bias": false,
+     "mm_projector_id": "patch_merge",
+     "spatial_merge_size": 2,
+     "add_pre_mm_projector_layer_norm": true,
+     "image_token_id": 10,
+     "image_break_token_id": 12,
+     "image_end_token_id": 13,
+     "image_size": 1540
+   }
+ }
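Note (not part of the commit): a few sizes follow arithmetically from the values in `params.json`. The sketch below derives them; the values are copied from the diff, `derived_shapes` is a hypothetical helper, and the last two quantities assume that the `patch_merge` projector folds each `spatial_merge_size` x `spatial_merge_size` block of patches into one token, which the config suggests but does not state.

```python
# Values copied from params.json above (only the fields used here).
params = {
    "dim": 5120,
    "head_dim": 128,
    "n_heads": 32,
    "n_kv_heads": 8,
    "vision_encoder": {
        "image_size": 1540,
        "patch_size": 14,
        "spatial_merge_size": 2,
    },
}


def derived_shapes(p):
    """Hypothetical helper: sizes implied by the config values."""
    v = p["vision_encoder"]
    patches_per_side = v["image_size"] // v["patch_size"]
    # Assumption: merging folds merge_size x merge_size patches into one token.
    merged_per_side = patches_per_side // v["spatial_merge_size"]
    return {
        # Width of the concatenated query heads; note it differs from "dim".
        "q_width": p["n_heads"] * p["head_dim"],
        # Width of the concatenated key (or value) heads.
        "kv_width": p["n_kv_heads"] * p["head_dim"],
        # Query heads that share one key/value head (grouped-query attention).
        "queries_per_kv_head": p["n_heads"] // p["n_kv_heads"],
        "patches_per_side": patches_per_side,
        "merged_tokens_per_side": merged_per_side,
        "max_merged_image_tokens": merged_per_side * merged_per_side,
    }


shapes = derived_shapes(params)
for key, value in shapes.items():
    print(f"{key}: {value}")
```

Under these assumptions a full 1540 x 1540 image yields a 110 x 110 patch grid and a 55 x 55 grid after merging, before any break or end tokens are counted.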
preprocessor_config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "crop_size": null,
+   "data_format": "channels_first",
+   "default_to_square": true,
+   "device": null,
+   "do_center_crop": null,
+   "do_convert_rgb": true,
+   "do_normalize": true,
+   "do_rescale": true,
+   "do_resize": true,
+   "image_mean": [
+     0.48145466,
+     0.4578275,
+     0.40821073
+   ],
+   "image_processor_type": "PixtralImageProcessorFast",
+   "image_std": [
+     0.26862954,
+     0.26130258,
+     0.27577711
+   ],
+   "input_data_format": null,
+   "patch_size": 14,
+   "processor_class": "PixtralProcessor",
+   "resample": 3,
+   "rescale_factor": 0.00392156862745098,
+   "return_tensors": null,
+   "size": {
+     "longest_edge": 1540
+   }
+ }
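Note (not part of the commit): with `do_rescale` and `do_normalize` both enabled, each pixel channel is presumably first multiplied by `rescale_factor` (which equals 1/255) and then standardized with `image_mean` and `image_std`. The sketch below applies that arithmetic to a single RGB pixel in plain Python. The constants are copied from the diff; `preprocess_pixel` is a hypothetical name and the order of operations is an assumption, not the image processor's actual code.

```python
# Constants copied from preprocessor_config.json above.
IMAGE_MEAN = (0.48145466, 0.4578275, 0.40821073)
IMAGE_STD = (0.26862954, 0.26130258, 0.27577711)
RESCALE_FACTOR = 0.00392156862745098  # equals 1 / 255


def preprocess_pixel(rgb):
    """Hypothetical helper: rescale an 8-bit RGB pixel to [0, 1], then standardize.

    Assumes rescaling happens before normalization.
    """
    return tuple(
        (channel * RESCALE_FACTOR - mean) / std
        for channel, mean, std in zip(rgb, IMAGE_MEAN, IMAGE_STD)
    )


white = preprocess_pixel((255, 255, 255))
black = preprocess_pixel((0, 0, 0))
print("white:", white)
print("black:", black)
```

Resizing (`do_resize`, `size.longest_edge`, `resample`) is left out of the sketch because it needs an image library.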
processor_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "image_break_token": "[IMG_BREAK]",
+   "image_end_token": "[IMG_END]",
+   "image_token": "[IMG]",
+   "patch_size": 14,
+   "processor_class": "PixtralProcessor",
+   "spatial_merge_size": 2
+ }
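Note (not part of the commit): the three token strings above suggest how an image is laid out in the prompt. The sketch below assumes one `[IMG]` token per merged patch, an `[IMG_BREAK]` after every row except the last, and a single `[IMG_END]` closing the image. That layout is an assumption for illustration and `image_placeholder_tokens` is a hypothetical function; the processor's real behavior should be checked against its documentation.

```python
def image_placeholder_tokens(height_px, width_px, patch_size=14, spatial_merge_size=2):
    """Hypothetical helper: placeholder tokens for an already-resized image.

    Assumes the image dimensions are multiples of patch_size * spatial_merge_size.
    """
    cell = patch_size * spatial_merge_size  # pixels covered by one merged token
    rows = height_px // cell
    cols = width_px // cell
    if rows < 1 or cols < 1:
        raise ValueError("image is smaller than one merged patch")

    tokens = []
    for _ in range(rows):
        tokens.extend(["[IMG]"] * cols)
        tokens.append("[IMG_BREAK]")
    # Assumption: the final row is closed by [IMG_END] instead of [IMG_BREAK].
    tokens[-1] = "[IMG_END]"
    return tokens


# A 56 x 84 pixel image gives 2 rows of 3 merged patches.
tokens = image_placeholder_tokens(height_px=56, width_px=84)
print(" ".join(tokens))
```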
special_tokens_map.json ADDED
@@ -0,0 +1,1032 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<unk>",
4
+ "<s>",
5
+ "</s>",
6
+ "[INST]",
7
+ "[/INST]",
8
+ "[AVAILABLE_TOOLS]",
9
+ "[/AVAILABLE_TOOLS]",
10
+ "[TOOL_RESULTS]",
11
+ "[/TOOL_RESULTS]",
12
+ "[TOOL_CALLS]",
13
+ "[IMG]",
14
+ "<pad>",
15
+ "[IMG_BREAK]",
16
+ "[IMG_END]",
17
+ "[PREFIX]",
18
+ "[MIDDLE]",
19
+ "[SUFFIX]",
20
+ "[SYSTEM_PROMPT]",
21
+ "[/SYSTEM_PROMPT]",
22
+ "[TOOL_CONTENT]",
23
+ "<SPECIAL_20>",
24
+ "<SPECIAL_21>",
25
+ "<SPECIAL_22>",
26
+ "<SPECIAL_23>",
27
+ "<SPECIAL_24>",
28
+ "<SPECIAL_25>",
29
+ "<SPECIAL_26>",
30
+ "<SPECIAL_27>",
31
+ "<SPECIAL_28>",
32
+ "<SPECIAL_29>",
33
+ "<SPECIAL_30>",
34
+ "<SPECIAL_31>",
35
+ "<SPECIAL_32>",
36
+ "<SPECIAL_33>",
37
+ "<SPECIAL_34>",
38
+ "<SPECIAL_35>",
39
+ "<SPECIAL_36>",
40
+ "<SPECIAL_37>",
41
+ "<SPECIAL_38>",
42
+ "<SPECIAL_39>",
43
+ "<SPECIAL_40>",
44
+ "<SPECIAL_41>",
45
+ "<SPECIAL_42>",
46
+ "<SPECIAL_43>",
47
+ "<SPECIAL_44>",
48
+ "<SPECIAL_45>",
49
+ "<SPECIAL_46>",
50
+ "<SPECIAL_47>",
51
+ "<SPECIAL_48>",
52
+ "<SPECIAL_49>",
53
+ "<SPECIAL_50>",
54
+ "<SPECIAL_51>",
55
+ "<SPECIAL_52>",
56
+ "<SPECIAL_53>",
57
+ "<SPECIAL_54>",
58
+ "<SPECIAL_55>",
59
+ "<SPECIAL_56>",
60
+ "<SPECIAL_57>",
61
+ "<SPECIAL_58>",
62
+ "<SPECIAL_59>",
63
+ "<SPECIAL_60>",
64
+ "<SPECIAL_61>",
65
+ "<SPECIAL_62>",
66
+ "<SPECIAL_63>",
67
+ "<SPECIAL_64>",
68
+ "<SPECIAL_65>",
69
+ "<SPECIAL_66>",
70
+ "<SPECIAL_67>",
71
+ "<SPECIAL_68>",
72
+ "<SPECIAL_69>",
73
+ "<SPECIAL_70>",
74
+ "<SPECIAL_71>",
75
+ "<SPECIAL_72>",
76
+ "<SPECIAL_73>",
77
+ "<SPECIAL_74>",
78
+ "<SPECIAL_75>",
79
+ "<SPECIAL_76>",
80
+ "<SPECIAL_77>",
81
+ "<SPECIAL_78>",
82
+ "<SPECIAL_79>",
83
+ "<SPECIAL_80>",
84
+ "<SPECIAL_81>",
85
+ "<SPECIAL_82>",
86
+ "<SPECIAL_83>",
87
+ "<SPECIAL_84>",
88
+ "<SPECIAL_85>",
89
+ "<SPECIAL_86>",
90
+ "<SPECIAL_87>",
91
+ "<SPECIAL_88>",
92
+ "<SPECIAL_89>",
93
+ "<SPECIAL_90>",
94
+ "<SPECIAL_91>",
95
+ "<SPECIAL_92>",
96
+ "<SPECIAL_93>",
97
+ "<SPECIAL_94>",
98
+ "<SPECIAL_95>",
99
+ "<SPECIAL_96>",
100
+ "<SPECIAL_97>",
101
+ "<SPECIAL_98>",
102
+ "<SPECIAL_99>",
103
+ "<SPECIAL_100>",
104
+ "<SPECIAL_101>",
105
+ "<SPECIAL_102>",
106
+ "<SPECIAL_103>",
107
+ "<SPECIAL_104>",
108
+ "<SPECIAL_105>",
109
+ "<SPECIAL_106>",
110
+ "<SPECIAL_107>",
111
+ "<SPECIAL_108>",
112
+ "<SPECIAL_109>",
113
+ "<SPECIAL_110>",
114
+ "<SPECIAL_111>",
115
+ "<SPECIAL_112>",
116
+ "<SPECIAL_113>",
117
+ "<SPECIAL_114>",
118
+ "<SPECIAL_115>",
119
+ "<SPECIAL_116>",
120
+ "<SPECIAL_117>",
121
+ "<SPECIAL_118>",
122
+ "<SPECIAL_119>",
123
+ "<SPECIAL_120>",
124
+ "<SPECIAL_121>",
125
+ "<SPECIAL_122>",
126
+ "<SPECIAL_123>",
127
+ "<SPECIAL_124>",
128
+ "<SPECIAL_125>",
129
+ "<SPECIAL_126>",
130
+ "<SPECIAL_127>",
131
+ "<SPECIAL_128>",
132
+ "<SPECIAL_129>",
133
+ "<SPECIAL_130>",
134
+ "<SPECIAL_131>",
135
+ "<SPECIAL_132>",
136
+ "<SPECIAL_133>",
137
+ "<SPECIAL_134>",
138
+ "<SPECIAL_135>",
139
+ "<SPECIAL_136>",
140
+ "<SPECIAL_137>",
141
+ "<SPECIAL_138>",
142
+ "<SPECIAL_139>",
143
+ "<SPECIAL_140>",
144
+ "<SPECIAL_141>",
145
+ "<SPECIAL_142>",
146
+ "<SPECIAL_143>",
147
+ "<SPECIAL_144>",
148
+ "<SPECIAL_145>",
149
+ "<SPECIAL_146>",
150
+ "<SPECIAL_147>",
151
+ "<SPECIAL_148>",
152
+ "<SPECIAL_149>",
153
+ "<SPECIAL_150>",
154
+ "<SPECIAL_151>",
155
+ "<SPECIAL_152>",
156
+ "<SPECIAL_153>",
157
+ "<SPECIAL_154>",
158
+ "<SPECIAL_155>",
159
+ "<SPECIAL_156>",
160
+ "<SPECIAL_157>",
161
+ "<SPECIAL_158>",
162
+ "<SPECIAL_159>",
163
+ "<SPECIAL_160>",
164
+ "<SPECIAL_161>",
165
+ "<SPECIAL_162>",
166
+ "<SPECIAL_163>",
167
+ "<SPECIAL_164>",
168
+ "<SPECIAL_165>",
169
+ "<SPECIAL_166>",
170
+ "<SPECIAL_167>",
171
+ "<SPECIAL_168>",
172
+ "<SPECIAL_169>",
173
+ "<SPECIAL_170>",
174
+ "<SPECIAL_171>",
175
+ "<SPECIAL_172>",
176
+ "<SPECIAL_173>",
177
+ "<SPECIAL_174>",
178
+ "<SPECIAL_175>",
179
+ "<SPECIAL_176>",
180
+ "<SPECIAL_177>",
181
+ "<SPECIAL_178>",
182
+ "<SPECIAL_179>",
183
+ "<SPECIAL_180>",
184
+ "<SPECIAL_181>",
185
+ "<SPECIAL_182>",
186
+ "<SPECIAL_183>",
187
+ "<SPECIAL_184>",
188
+ "<SPECIAL_185>",
189
+ "<SPECIAL_186>",
190
+ "<SPECIAL_187>",
191
+ "<SPECIAL_188>",
192
+ "<SPECIAL_189>",
193
+ "<SPECIAL_190>",
194
+ "<SPECIAL_191>",
195
+ "<SPECIAL_192>",
196
+ "<SPECIAL_193>",
197
+ "<SPECIAL_194>",
198
+ "<SPECIAL_195>",
199
+ "<SPECIAL_196>",
200
+ "<SPECIAL_197>",
201
+ "<SPECIAL_198>",
202
+ "<SPECIAL_199>",
203
+ "<SPECIAL_200>",
204
+ "<SPECIAL_201>",
205
+ "<SPECIAL_202>",
206
+ "<SPECIAL_203>",
207
+ "<SPECIAL_204>",
208
+ "<SPECIAL_205>",
209
+ "<SPECIAL_206>",
210
+ "<SPECIAL_207>",
211
+ "<SPECIAL_208>",
212
+ "<SPECIAL_209>",
213
+ "<SPECIAL_210>",
214
+ "<SPECIAL_211>",
215
+ "<SPECIAL_212>",
216
+ "<SPECIAL_213>",
217
+ "<SPECIAL_214>",
218
+ "<SPECIAL_215>",
219
+ "<SPECIAL_216>",
220
+ "<SPECIAL_217>",
221
+ "<SPECIAL_218>",
222
+ "<SPECIAL_219>",
223
+ "<SPECIAL_220>",
224
+ "<SPECIAL_221>",
225
+ "<SPECIAL_222>",
226
+ "<SPECIAL_223>",
227
+ "<SPECIAL_224>",
228
+ "<SPECIAL_225>",
229
+ "<SPECIAL_226>",
230
+ "<SPECIAL_227>",
231
+ "<SPECIAL_228>",
232
+ "<SPECIAL_229>",
233
+ "<SPECIAL_230>",
234
+ "<SPECIAL_231>",
235
+ "<SPECIAL_232>",
236
+ "<SPECIAL_233>",
237
+ "<SPECIAL_234>",
238
+ "<SPECIAL_235>",
239
+ "<SPECIAL_236>",
240
+ "<SPECIAL_237>",
241
+ "<SPECIAL_238>",
242
+ "<SPECIAL_239>",
243
+ "<SPECIAL_240>",
244
+ "<SPECIAL_241>",
245
+ "<SPECIAL_242>",
246
+ "<SPECIAL_243>",
247
+ "<SPECIAL_244>",
248
+ "<SPECIAL_245>",
249
+ "<SPECIAL_246>",
250
+ "<SPECIAL_247>",
251
+ "<SPECIAL_248>",
252
+ "<SPECIAL_249>",
253
+ "<SPECIAL_250>",
254
+ "<SPECIAL_251>",
255
+ "<SPECIAL_252>",
256
+ "<SPECIAL_253>",
257
+ "<SPECIAL_254>",
258
+ "<SPECIAL_255>",
259
+ "<SPECIAL_256>",
260
+ "<SPECIAL_257>",
261
+ "<SPECIAL_258>",
262
+ "<SPECIAL_259>",
263
+ "<SPECIAL_260>",
264
+ "<SPECIAL_261>",
265
+ "<SPECIAL_262>",
266
+ "<SPECIAL_263>",
267
+ "<SPECIAL_264>",
268
+ "<SPECIAL_265>",
269
+ "<SPECIAL_266>",
270
+ "<SPECIAL_267>",
271
+ "<SPECIAL_268>",
272
+ "<SPECIAL_269>",
273
+ "<SPECIAL_270>",
274
+ "<SPECIAL_271>",
275
+ "<SPECIAL_272>",
276
+ "<SPECIAL_273>",
277
+ "<SPECIAL_274>",
278
+ "<SPECIAL_275>",
279
+ "<SPECIAL_276>",
280
+ "<SPECIAL_277>",
281
+ "<SPECIAL_278>",
282
+ "<SPECIAL_279>",
283
+ "<SPECIAL_280>",
284
+ "<SPECIAL_281>",
285
+ "<SPECIAL_282>",
286
+ "<SPECIAL_283>",
287
+ "<SPECIAL_284>",
288
+ "<SPECIAL_285>",
289
+ "<SPECIAL_286>",
290
+ "<SPECIAL_287>",
291
+ "<SPECIAL_288>",
292
+ "<SPECIAL_289>",
293
+ "<SPECIAL_290>",
294
+ "<SPECIAL_291>",
295
+ "<SPECIAL_292>",
296
+ "<SPECIAL_293>",
297
+ "<SPECIAL_294>",
298
+ "<SPECIAL_295>",
299
+ "<SPECIAL_296>",
300
+ "<SPECIAL_297>",
301
+ "<SPECIAL_298>",
302
+ "<SPECIAL_299>",
303
+ "<SPECIAL_300>",
304
+ "<SPECIAL_301>",
305
+ "<SPECIAL_302>",
306
+ "<SPECIAL_303>",
307
+ "<SPECIAL_304>",
308
+ "<SPECIAL_305>",
309
+ "<SPECIAL_306>",
310
+ "<SPECIAL_307>",
311
+ "<SPECIAL_308>",
312
+ "<SPECIAL_309>",
313
+ "<SPECIAL_310>",
314
+ "<SPECIAL_311>",
315
+ "<SPECIAL_312>",
316
+ "<SPECIAL_313>",
317
+ "<SPECIAL_314>",
318
+ "<SPECIAL_315>",
319
+ "<SPECIAL_316>",
320
+ "<SPECIAL_317>",
321
+ "<SPECIAL_318>",
322
+ "<SPECIAL_319>",
323
+ "<SPECIAL_320>",
324
+ "<SPECIAL_321>",
325
+ "<SPECIAL_322>",
326
+ "<SPECIAL_323>",
327
+ "<SPECIAL_324>",
328
+ "<SPECIAL_325>",
329
+ "<SPECIAL_326>",
330
+ "<SPECIAL_327>",
331
+ "<SPECIAL_328>",
332
+ "<SPECIAL_329>",
333
+ "<SPECIAL_330>",
334
+ "<SPECIAL_331>",
335
+ "<SPECIAL_332>",
336
+ "<SPECIAL_333>",
337
+ "<SPECIAL_334>",
338
+ "<SPECIAL_335>",
339
+ "<SPECIAL_336>",
340
+ "<SPECIAL_337>",
341
+ "<SPECIAL_338>",
342
+ "<SPECIAL_339>",
343
+ "<SPECIAL_340>",
344
+ "<SPECIAL_341>",
345
+ "<SPECIAL_342>",
346
+ "<SPECIAL_343>",
347
+ "<SPECIAL_344>",
348
+ "<SPECIAL_345>",
349
+ "<SPECIAL_346>",
350
+ "<SPECIAL_347>",
351
+ "<SPECIAL_348>",
352
+ "<SPECIAL_349>",
353
+ "<SPECIAL_350>",
354
+ "<SPECIAL_351>",
355
+ "<SPECIAL_352>",
356
+ "<SPECIAL_353>",
357
+ "<SPECIAL_354>",
358
+ "<SPECIAL_355>",
359
+ "<SPECIAL_356>",
360
+ "<SPECIAL_357>",
361
+ "<SPECIAL_358>",
362
+ "<SPECIAL_359>",
363
+ "<SPECIAL_360>",
364
+ "<SPECIAL_361>",
365
+ "<SPECIAL_362>",
366
+ "<SPECIAL_363>",
367
+ "<SPECIAL_364>",
368
+ "<SPECIAL_365>",
369
+ "<SPECIAL_366>",
370
+ "<SPECIAL_367>",
371
+ "<SPECIAL_368>",
372
+ "<SPECIAL_369>",
373
+ "<SPECIAL_370>",
374
+ "<SPECIAL_371>",
375
+ "<SPECIAL_372>",
376
+ "<SPECIAL_373>",
377
+ "<SPECIAL_374>",
378
+ "<SPECIAL_375>",
379
+ "<SPECIAL_376>",
380
+ "<SPECIAL_377>",
381
+ "<SPECIAL_378>",
382
+ "<SPECIAL_379>",
383
+ "<SPECIAL_380>",
384
+ "<SPECIAL_381>",
385
+ "<SPECIAL_382>",
386
+ "<SPECIAL_383>",
387
+ "<SPECIAL_384>",
388
+ "<SPECIAL_385>",
389
+ "<SPECIAL_386>",
390
+ "<SPECIAL_387>",
391
+ "<SPECIAL_388>",
392
+ "<SPECIAL_389>",
393
+ "<SPECIAL_390>",
394
+ "<SPECIAL_391>",
395
+ "<SPECIAL_392>",
396
+ "<SPECIAL_393>",
397
+ "<SPECIAL_394>",
398
+ "<SPECIAL_395>",
399
+ "<SPECIAL_396>",
400
+ "<SPECIAL_397>",
401
+ "<SPECIAL_398>",
402
+ "<SPECIAL_399>",
403
+ "<SPECIAL_400>",
404
+ "<SPECIAL_401>",
405
+ "<SPECIAL_402>",
406
+ "<SPECIAL_403>",
407
+ "<SPECIAL_404>",
408
+ "<SPECIAL_405>",
409
+ "<SPECIAL_406>",
410
+ "<SPECIAL_407>",
411
+ "<SPECIAL_408>",
412
+ "<SPECIAL_409>",
413
+ "<SPECIAL_410>",
414
+ "<SPECIAL_411>",
415
+ "<SPECIAL_412>",
416
+ "<SPECIAL_413>",
417
+ "<SPECIAL_414>",
418
+ "<SPECIAL_415>",
419
+ "<SPECIAL_416>",
420
+ "<SPECIAL_417>",
421
+ "<SPECIAL_418>",
422
+ "<SPECIAL_419>",
423
+ "<SPECIAL_420>",
424
+ "<SPECIAL_421>",
425
+ "<SPECIAL_422>",
426
+ "<SPECIAL_423>",
427
+ "<SPECIAL_424>",
428
+ "<SPECIAL_425>",
429
+ "<SPECIAL_426>",
430
+ "<SPECIAL_427>",
431
+ "<SPECIAL_428>",
432
+ "<SPECIAL_429>",
433
+ "<SPECIAL_430>",
434
+ "<SPECIAL_431>",
435
+ "<SPECIAL_432>",
436
+ "<SPECIAL_433>",
437
+ "<SPECIAL_434>",
438
+ "<SPECIAL_435>",
439
+ "<SPECIAL_436>",
440
+ "<SPECIAL_437>",
441
+ "<SPECIAL_438>",
442
+ "<SPECIAL_439>",
443
+ "<SPECIAL_440>",
444
+ "<SPECIAL_441>",
445
+ "<SPECIAL_442>",
446
+ "<SPECIAL_443>",
447
+ "<SPECIAL_444>",
448
+ "<SPECIAL_445>",
449
+ "<SPECIAL_446>",
450
+ "<SPECIAL_447>",
451
+ "<SPECIAL_448>",
452
+ "<SPECIAL_449>",
453
+ "<SPECIAL_450>",
454
+ "<SPECIAL_451>",
455
+ "<SPECIAL_452>",
456
+ "<SPECIAL_453>",
457
+ "<SPECIAL_454>",
458
+ "<SPECIAL_455>",
459
+ "<SPECIAL_456>",
460
+ "<SPECIAL_457>",
461
+ "<SPECIAL_458>",
462
+ "<SPECIAL_459>",
463
+ "<SPECIAL_460>",
464
+ "<SPECIAL_461>",
465
+ "<SPECIAL_462>",
466
+ "<SPECIAL_463>",
467
+ "<SPECIAL_464>",
468
+ "<SPECIAL_465>",
469
+ "<SPECIAL_466>",
470
+ "<SPECIAL_467>",
471
+ "<SPECIAL_468>",
472
+ "<SPECIAL_469>",
473
+ "<SPECIAL_470>",
474
+ "<SPECIAL_471>",
475
+ "<SPECIAL_472>",
476
+ "<SPECIAL_473>",
477
+ "<SPECIAL_474>",
478
+ "<SPECIAL_475>",
479
+ "<SPECIAL_476>",
480
+ "<SPECIAL_477>",
481
+ "<SPECIAL_478>",
482
+ "<SPECIAL_479>",
483
+ "<SPECIAL_480>",
484
+ "<SPECIAL_481>",
485
+ "<SPECIAL_482>",
486
+ "<SPECIAL_483>",
487
+ "<SPECIAL_484>",
488
+ "<SPECIAL_485>",
489
+ "<SPECIAL_486>",
490
+ "<SPECIAL_487>",
491
+ "<SPECIAL_488>",
492
+ "<SPECIAL_489>",
493
+ "<SPECIAL_490>",
494
+ "<SPECIAL_491>",
495
+ "<SPECIAL_492>",
496
+ "<SPECIAL_493>",
497
+ "<SPECIAL_494>",
498
+ "<SPECIAL_495>",
499
+ "<SPECIAL_496>",
500
+ "<SPECIAL_497>",
501
+ "<SPECIAL_498>",
502
+ "<SPECIAL_499>",
503
+ "<SPECIAL_500>",
504
+ "<SPECIAL_501>",
505
+ "<SPECIAL_502>",
506
+ "<SPECIAL_503>",
507
+ "<SPECIAL_504>",
508
+ "<SPECIAL_505>",
509
+ "<SPECIAL_506>",
510
+ "<SPECIAL_507>",
511
+ "<SPECIAL_508>",
512
+ "<SPECIAL_509>",
513
+ "<SPECIAL_510>",
514
+ "<SPECIAL_511>",
515
+ "<SPECIAL_512>",
516
+ "<SPECIAL_513>",
517
+ "<SPECIAL_514>",
518
+ "<SPECIAL_515>",
519
+ "<SPECIAL_516>",
520
+ "<SPECIAL_517>",
521
+ "<SPECIAL_518>",
522
+ "<SPECIAL_519>",
523
+ "<SPECIAL_520>",
524
+ "<SPECIAL_521>",
525
+ "<SPECIAL_522>",
526
+ "<SPECIAL_523>",
527
+ "<SPECIAL_524>",
528
+ "<SPECIAL_525>",
529
+ "<SPECIAL_526>",
530
+ "<SPECIAL_527>",
531
+ "<SPECIAL_528>",
532
+ "<SPECIAL_529>",
533
+ "<SPECIAL_530>",
534
+ "<SPECIAL_531>",
535
+ "<SPECIAL_532>",
536
+ "<SPECIAL_533>",
537
+ "<SPECIAL_534>",
538
+ "<SPECIAL_535>",
539
+ "<SPECIAL_536>",
540
+ "<SPECIAL_537>",
541
+ "<SPECIAL_538>",
542
+ "<SPECIAL_539>",
543
+ "<SPECIAL_540>",
544
+ "<SPECIAL_541>",
545
+ "<SPECIAL_542>",
546
+ "<SPECIAL_543>",
547
+ "<SPECIAL_544>",
548
+ "<SPECIAL_545>",
549
+ "<SPECIAL_546>",
550
+ "<SPECIAL_547>",
551
+ "<SPECIAL_548>",
552
+ "<SPECIAL_549>",
553
+ "<SPECIAL_550>",
554
+ "<SPECIAL_551>",
555
+ "<SPECIAL_552>",
556
+ "<SPECIAL_553>",
557
+ "<SPECIAL_554>",
558
+ "<SPECIAL_555>",
559
+ "<SPECIAL_556>",
560
+ "<SPECIAL_557>",
561
+ "<SPECIAL_558>",
562
+ "<SPECIAL_559>",
563
+ "<SPECIAL_560>",
564
+ "<SPECIAL_561>",
565
+ "<SPECIAL_562>",
566
+ "<SPECIAL_563>",
567
+ "<SPECIAL_564>",
568
+ "<SPECIAL_565>",
569
+ "<SPECIAL_566>",
570
+ "<SPECIAL_567>",
571
+ "<SPECIAL_568>",
572
+ "<SPECIAL_569>",
573
+ "<SPECIAL_570>",
574
+ "<SPECIAL_571>",
575
+ "<SPECIAL_572>",
576
+ "<SPECIAL_573>",
577
+ "<SPECIAL_574>",
578
+ "<SPECIAL_575>",
579
+ "<SPECIAL_576>",
580
+ "<SPECIAL_577>",
581
+ "<SPECIAL_578>",
582
+ "<SPECIAL_579>",
583
+ "<SPECIAL_580>",
584
+ "<SPECIAL_581>",
585
+ "<SPECIAL_582>",
586
+ "<SPECIAL_583>",
587
+ "<SPECIAL_584>",
588
+ "<SPECIAL_585>",
589
+ "<SPECIAL_586>",
590
+ "<SPECIAL_587>",
591
+ "<SPECIAL_588>",
592
+ "<SPECIAL_589>",
593
+ "<SPECIAL_590>",
594
+ "<SPECIAL_591>",
595
+ "<SPECIAL_592>",
596
+ "<SPECIAL_593>",
597
+ "<SPECIAL_594>",
598
+ "<SPECIAL_595>",
599
+ "<SPECIAL_596>",
600
+ "<SPECIAL_597>",
601
+ "<SPECIAL_598>",
602
+ "<SPECIAL_599>",
603
+ "<SPECIAL_600>",
604
+ "<SPECIAL_601>",
605
+ "<SPECIAL_602>",
606
+ "<SPECIAL_603>",
607
+ "<SPECIAL_604>",
608
+ "<SPECIAL_605>",
609
+ "<SPECIAL_606>",
610
+ "<SPECIAL_607>",
611
+ "<SPECIAL_608>",
612
+ "<SPECIAL_609>",
613
+ "<SPECIAL_610>",
614
+ "<SPECIAL_611>",
615
+ "<SPECIAL_612>",
616
+ "<SPECIAL_613>",
617
+ "<SPECIAL_614>",
618
+ "<SPECIAL_615>",
619
+ "<SPECIAL_616>",
620
+ "<SPECIAL_617>",
621
+ "<SPECIAL_618>",
622
+ "<SPECIAL_619>",
623
+ "<SPECIAL_620>",
624
+ "<SPECIAL_621>",
625
+ "<SPECIAL_622>",
626
+ "<SPECIAL_623>",
627
+ "<SPECIAL_624>",
628
+ "<SPECIAL_625>",
629
+ "<SPECIAL_626>",
630
+ "<SPECIAL_627>",
631
+ "<SPECIAL_628>",
632
+ "<SPECIAL_629>",
633
+ "<SPECIAL_630>",
634
+ "<SPECIAL_631>",
635
+ "<SPECIAL_632>",
636
+ "<SPECIAL_633>",
637
+ "<SPECIAL_634>",
638
+ "<SPECIAL_635>",
639
+ "<SPECIAL_636>",
640
+ "<SPECIAL_637>",
641
+ "<SPECIAL_638>",
642
+ "<SPECIAL_639>",
643
+ "<SPECIAL_640>",
644
+ "<SPECIAL_641>",
645
+ "<SPECIAL_642>",
646
+ "<SPECIAL_643>",
647
+ "<SPECIAL_644>",
648
+ "<SPECIAL_645>",
649
+ "<SPECIAL_646>",
650
+ "<SPECIAL_647>",
651
+ "<SPECIAL_648>",
652
+ "<SPECIAL_649>",
653
+ "<SPECIAL_650>",
654
+ "<SPECIAL_651>",
655
+ "<SPECIAL_652>",
656
+ "<SPECIAL_653>",
657
+ "<SPECIAL_654>",
658
+ "<SPECIAL_655>",
659
+ "<SPECIAL_656>",
660
+ "<SPECIAL_657>",
661
+ "<SPECIAL_658>",
662
+ "<SPECIAL_659>",
663
+ "<SPECIAL_660>",
664
+ "<SPECIAL_661>",
665
+ "<SPECIAL_662>",
666
+ "<SPECIAL_663>",
667
+ "<SPECIAL_664>",
668
+ "<SPECIAL_665>",
669
+ "<SPECIAL_666>",
670
+ "<SPECIAL_667>",
671
+ "<SPECIAL_668>",
672
+ "<SPECIAL_669>",
673
+ "<SPECIAL_670>",
674
+ "<SPECIAL_671>",
675
+ "<SPECIAL_672>",
676
+ "<SPECIAL_673>",
677
+ "<SPECIAL_674>",
678
+ "<SPECIAL_675>",
679
+ "<SPECIAL_676>",
680
+ "<SPECIAL_677>",
681
+ "<SPECIAL_678>",
682
+ "<SPECIAL_679>",
683
+ "<SPECIAL_680>",
684
+ "<SPECIAL_681>",
685
+ "<SPECIAL_682>",
686
+ "<SPECIAL_683>",
687
+ "<SPECIAL_684>",
688
+ "<SPECIAL_685>",
689
+ "<SPECIAL_686>",
690
+ "<SPECIAL_687>",
691
+ "<SPECIAL_688>",
692
+ "<SPECIAL_689>",
693
+ "<SPECIAL_690>",
694
+ "<SPECIAL_691>",
695
+ "<SPECIAL_692>",
696
+ "<SPECIAL_693>",
697
+ "<SPECIAL_694>",
698
+ "<SPECIAL_695>",
699
+ "<SPECIAL_696>",
700
+ "<SPECIAL_697>",
701
+ "<SPECIAL_698>",
702
+ "<SPECIAL_699>",
703
+ "<SPECIAL_700>",
704
+ "<SPECIAL_701>",
705
+ "<SPECIAL_702>",
706
+ "<SPECIAL_703>",
707
+ "<SPECIAL_704>",
708
+ "<SPECIAL_705>",
709
+ "<SPECIAL_706>",
710
+ "<SPECIAL_707>",
711
+ "<SPECIAL_708>",
712
+ "<SPECIAL_709>",
713
+ "<SPECIAL_710>",
714
+ "<SPECIAL_711>",
715
+ "<SPECIAL_712>",
716
+ "<SPECIAL_713>",
717
+ "<SPECIAL_714>",
718
+ "<SPECIAL_715>",
719
+ "<SPECIAL_716>",
720
+ "<SPECIAL_717>",
721
+ "<SPECIAL_718>",
722
+ "<SPECIAL_719>",
723
+ "<SPECIAL_720>",
724
+ "<SPECIAL_721>",
725
+ "<SPECIAL_722>",
726
+ "<SPECIAL_723>",
727
+ "<SPECIAL_724>",
728
+ "<SPECIAL_725>",
729
+ "<SPECIAL_726>",
730
+ "<SPECIAL_727>",
731
+ "<SPECIAL_728>",
732
+ "<SPECIAL_729>",
733
+ "<SPECIAL_730>",
734
+ "<SPECIAL_731>",
735
+ "<SPECIAL_732>",
736
+ "<SPECIAL_733>",
737
+ "<SPECIAL_734>",
738
+ "<SPECIAL_735>",
739
+ "<SPECIAL_736>",
740
+ "<SPECIAL_737>",
741
+ "<SPECIAL_738>",
742
+ "<SPECIAL_739>",
743
+ "<SPECIAL_740>",
744
+ "<SPECIAL_741>",
745
+ "<SPECIAL_742>",
746
+ "<SPECIAL_743>",
747
+ "<SPECIAL_744>",
748
+ "<SPECIAL_745>",
749
+ "<SPECIAL_746>",
750
+ "<SPECIAL_747>",
751
+ "<SPECIAL_748>",
752
+ "<SPECIAL_749>",
753
+ "<SPECIAL_750>",
754
+ "<SPECIAL_751>",
755
+ "<SPECIAL_752>",
756
+ "<SPECIAL_753>",
757
+ "<SPECIAL_754>",
758
+ "<SPECIAL_755>",
759
+ "<SPECIAL_756>",
760
+ "<SPECIAL_757>",
761
+ "<SPECIAL_758>",
762
+ "<SPECIAL_759>",
763
+ "<SPECIAL_760>",
764
+ "<SPECIAL_761>",
765
+ "<SPECIAL_762>",
766
+ "<SPECIAL_763>",
767
+ "<SPECIAL_764>",
768
+ "<SPECIAL_765>",
769
+ "<SPECIAL_766>",
770
+ "<SPECIAL_767>",
771
+ "<SPECIAL_768>",
772
+ "<SPECIAL_769>",
773
+ "<SPECIAL_770>",
774
+ "<SPECIAL_771>",
775
+ "<SPECIAL_772>",
776
+ "<SPECIAL_773>",
777
+ "<SPECIAL_774>",
778
+ "<SPECIAL_775>",
779
+ "<SPECIAL_776>",
780
+ "<SPECIAL_777>",
781
+ "<SPECIAL_778>",
782
+ "<SPECIAL_779>",
783
+ "<SPECIAL_780>",
784
+ "<SPECIAL_781>",
785
+ "<SPECIAL_782>",
786
+ "<SPECIAL_783>",
787
+ "<SPECIAL_784>",
788
+ "<SPECIAL_785>",
789
+ "<SPECIAL_786>",
790
+ "<SPECIAL_787>",
791
+ "<SPECIAL_788>",
792
+ "<SPECIAL_789>",
793
+ "<SPECIAL_790>",
794
+ "<SPECIAL_791>",
795
+ "<SPECIAL_792>",
796
+ "<SPECIAL_793>",
797
+ "<SPECIAL_794>",
798
+ "<SPECIAL_795>",
799
+ "<SPECIAL_796>",
800
+ "<SPECIAL_797>",
801
+ "<SPECIAL_798>",
802
+ "<SPECIAL_799>",
803
+ "<SPECIAL_800>",
804
+ "<SPECIAL_801>",
805
+ "<SPECIAL_802>",
806
+ "<SPECIAL_803>",
807
+ "<SPECIAL_804>",
808
+ "<SPECIAL_805>",
809
+ "<SPECIAL_806>",
810
+ "<SPECIAL_807>",
811
+ "<SPECIAL_808>",
812
+ "<SPECIAL_809>",
813
+ "<SPECIAL_810>",
814
+ "<SPECIAL_811>",
815
+ "<SPECIAL_812>",
816
+ "<SPECIAL_813>",
817
+ "<SPECIAL_814>",
818
+ "<SPECIAL_815>",
819
+ "<SPECIAL_816>",
820
+ "<SPECIAL_817>",
821
+ "<SPECIAL_818>",
822
+ "<SPECIAL_819>",
823
+ "<SPECIAL_820>",
824
+ "<SPECIAL_821>",
825
+ "<SPECIAL_822>",
826
+ "<SPECIAL_823>",
827
+ "<SPECIAL_824>",
828
+ "<SPECIAL_825>",
829
+ "<SPECIAL_826>",
830
+ "<SPECIAL_827>",
831
+ "<SPECIAL_828>",
832
+ "<SPECIAL_829>",
833
+ "<SPECIAL_830>",
834
+ "<SPECIAL_831>",
835
+ "<SPECIAL_832>",
836
+ "<SPECIAL_833>",
837
+ "<SPECIAL_834>",
838
+ "<SPECIAL_835>",
839
+ "<SPECIAL_836>",
840
+ "<SPECIAL_837>",
841
+ "<SPECIAL_838>",
842
+ "<SPECIAL_839>",
843
+ "<SPECIAL_840>",
844
+ "<SPECIAL_841>",
845
+ "<SPECIAL_842>",
846
+ "<SPECIAL_843>",
847
+ "<SPECIAL_844>",
848
+ "<SPECIAL_845>",
849
+ "<SPECIAL_846>",
850
+ "<SPECIAL_847>",
851
+ "<SPECIAL_848>",
852
+ "<SPECIAL_849>",
853
+ "<SPECIAL_850>",
854
+ "<SPECIAL_851>",
855
+ "<SPECIAL_852>",
856
+ "<SPECIAL_853>",
857
+ "<SPECIAL_854>",
858
+ "<SPECIAL_855>",
859
+ "<SPECIAL_856>",
860
+ "<SPECIAL_857>",
861
+ "<SPECIAL_858>",
862
+ "<SPECIAL_859>",
863
+ "<SPECIAL_860>",
864
+ "<SPECIAL_861>",
865
+ "<SPECIAL_862>",
866
+ "<SPECIAL_863>",
867
+ "<SPECIAL_864>",
868
+ "<SPECIAL_865>",
869
+ "<SPECIAL_866>",
870
+ "<SPECIAL_867>",
871
+ "<SPECIAL_868>",
872
+ "<SPECIAL_869>",
873
+ "<SPECIAL_870>",
874
+ "<SPECIAL_871>",
875
+ "<SPECIAL_872>",
876
+ "<SPECIAL_873>",
877
+ "<SPECIAL_874>",
878
+ "<SPECIAL_875>",
879
+ "<SPECIAL_876>",
880
+ "<SPECIAL_877>",
881
+ "<SPECIAL_878>",
882
+ "<SPECIAL_879>",
883
+ "<SPECIAL_880>",
884
+ "<SPECIAL_881>",
885
+ "<SPECIAL_882>",
886
+ "<SPECIAL_883>",
887
+ "<SPECIAL_884>",
888
+ "<SPECIAL_885>",
889
+ "<SPECIAL_886>",
890
+ "<SPECIAL_887>",
891
+ "<SPECIAL_888>",
892
+ "<SPECIAL_889>",
893
+ "<SPECIAL_890>",
894
+ "<SPECIAL_891>",
895
+ "<SPECIAL_892>",
896
+ "<SPECIAL_893>",
897
+ "<SPECIAL_894>",
898
+ "<SPECIAL_895>",
899
+ "<SPECIAL_896>",
900
+ "<SPECIAL_897>",
901
+ "<SPECIAL_898>",
902
+ "<SPECIAL_899>",
903
+ "<SPECIAL_900>",
904
+ "<SPECIAL_901>",
905
+ "<SPECIAL_902>",
906
+ "<SPECIAL_903>",
907
+ "<SPECIAL_904>",
908
+ "<SPECIAL_905>",
909
+ "<SPECIAL_906>",
910
+ "<SPECIAL_907>",
911
+ "<SPECIAL_908>",
912
+ "<SPECIAL_909>",
913
+ "<SPECIAL_910>",
914
+ "<SPECIAL_911>",
915
+ "<SPECIAL_912>",
916
+ "<SPECIAL_913>",
917
+ "<SPECIAL_914>",
918
+ "<SPECIAL_915>",
919
+ "<SPECIAL_916>",
920
+ "<SPECIAL_917>",
921
+ "<SPECIAL_918>",
922
+ "<SPECIAL_919>",
923
+ "<SPECIAL_920>",
924
+ "<SPECIAL_921>",
925
+ "<SPECIAL_922>",
926
+ "<SPECIAL_923>",
927
+ "<SPECIAL_924>",
928
+ "<SPECIAL_925>",
929
+ "<SPECIAL_926>",
930
+ "<SPECIAL_927>",
931
+ "<SPECIAL_928>",
932
+ "<SPECIAL_929>",
933
+ "<SPECIAL_930>",
934
+ "<SPECIAL_931>",
935
+ "<SPECIAL_932>",
936
+ "<SPECIAL_933>",
937
+ "<SPECIAL_934>",
938
+ "<SPECIAL_935>",
939
+ "<SPECIAL_936>",
940
+ "<SPECIAL_937>",
941
+ "<SPECIAL_938>",
942
+ "<SPECIAL_939>",
943
+ "<SPECIAL_940>",
944
+ "<SPECIAL_941>",
945
+ "<SPECIAL_942>",
946
+ "<SPECIAL_943>",
947
+ "<SPECIAL_944>",
948
+ "<SPECIAL_945>",
949
+ "<SPECIAL_946>",
950
+ "<SPECIAL_947>",
951
+ "<SPECIAL_948>",
952
+ "<SPECIAL_949>",
953
+ "<SPECIAL_950>",
954
+ "<SPECIAL_951>",
955
+ "<SPECIAL_952>",
956
+ "<SPECIAL_953>",
957
+ "<SPECIAL_954>",
958
+ "<SPECIAL_955>",
959
+ "<SPECIAL_956>",
960
+ "<SPECIAL_957>",
961
+ "<SPECIAL_958>",
962
+ "<SPECIAL_959>",
963
+ "<SPECIAL_960>",
964
+ "<SPECIAL_961>",
965
+ "<SPECIAL_962>",
966
+ "<SPECIAL_963>",
967
+ "<SPECIAL_964>",
968
+ "<SPECIAL_965>",
969
+ "<SPECIAL_966>",
970
+ "<SPECIAL_967>",
971
+ "<SPECIAL_968>",
972
+ "<SPECIAL_969>",
973
+ "<SPECIAL_970>",
974
+ "<SPECIAL_971>",
975
+ "<SPECIAL_972>",
976
+ "<SPECIAL_973>",
977
+ "<SPECIAL_974>",
978
+ "<SPECIAL_975>",
979
+ "<SPECIAL_976>",
980
+ "<SPECIAL_977>",
981
+ "<SPECIAL_978>",
982
+ "<SPECIAL_979>",
983
+ "<SPECIAL_980>",
984
+ "<SPECIAL_981>",
985
+ "<SPECIAL_982>",
986
+ "<SPECIAL_983>",
987
+ "<SPECIAL_984>",
988
+ "<SPECIAL_985>",
989
+ "<SPECIAL_986>",
990
+ "<SPECIAL_987>",
991
+ "<SPECIAL_988>",
992
+ "<SPECIAL_989>",
993
+ "<SPECIAL_990>",
994
+ "<SPECIAL_991>",
995
+ "<SPECIAL_992>",
996
+ "<SPECIAL_993>",
997
+ "<SPECIAL_994>",
998
+ "<SPECIAL_995>",
999
+ "<SPECIAL_996>",
1000
+ "<SPECIAL_997>",
1001
+ "<SPECIAL_998>",
1002
+ "<SPECIAL_999>"
1003
+ ],
1004
+ "bos_token": {
1005
+ "content": "<s>",
1006
+ "lstrip": false,
1007
+ "normalized": false,
1008
+ "rstrip": false,
1009
+ "single_word": false
1010
+ },
1011
+ "eos_token": {
1012
+ "content": "</s>",
1013
+ "lstrip": false,
1014
+ "normalized": false,
1015
+ "rstrip": false,
1016
+ "single_word": false
1017
+ },
1018
+ "pad_token": {
1019
+ "content": "<pad>",
1020
+ "lstrip": false,
1021
+ "normalized": false,
1022
+ "rstrip": false,
1023
+ "single_word": false
1024
+ },
1025
+ "unk_token": {
1026
+ "content": "<unk>",
1027
+ "lstrip": false,
1028
+ "normalized": false,
1029
+ "rstrip": false,
1030
+ "single_word": false
1031
+ }
1032
+ }
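The added `special_tokens_map.json` holds 1000 entries under `additional_special_tokens` (20 named control tokens followed by the placeholders `<SPECIAL_20>` through `<SPECIAL_999>`), plus four token objects that each carry a `content` string. A minimal sketch of summarizing a file with this shape, using only the standard library: the function name `summarize_special_tokens` and the shortened `EXCERPT` are illustrative assumptions, not part of the commit.

```python
import json

# Shortened excerpt with the same shape as the file in this diff.
# The real list has 1000 entries; only six are kept here (assumption for brevity).
EXCERPT = """
{
  "additional_special_tokens": ["<unk>", "<s>", "</s>", "[INST]", "[/INST]", "<SPECIAL_20>"],
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "pad_token": {"content": "<pad>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
"""


def summarize_special_tokens(raw: str) -> dict:
    """Hypothetical helper: collect named tokens and count list entries."""
    data = json.loads(raw)
    extra = data.get("additional_special_tokens", [])
    # Token objects such as bos_token are dicts with a "content" field.
    named = {
        key: value["content"]
        for key, value in data.items()
        if isinstance(value, dict) and "content" in value
    }
    # Placeholder entries follow the <SPECIAL_n> naming seen in the diff.
    placeholders = [tok for tok in extra if tok.startswith("<SPECIAL_")]
    return {
        "named": named,
        "n_additional": len(extra),
        "n_placeholders": len(placeholders),
    }


summary = summarize_special_tokens(EXCERPT)
print(summary)
```

Run against the full file, the same helper would be expected to report 1000 additional tokens, 980 of them placeholders, going by the line numbers in the diff.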
tekken.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c604f35d1035f534519622c0ec83fed6184978d4fdee92a5bd2a50bc05438094
+ size 14801330
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b76085f9923309d873994d444989f7eb6ec074b06f25b58f1e8d7b7741070949
+ size 17078037
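The `tekken.json` and `tokenizer.json` entries above are Git LFS pointer files, not the tokenizer data itself: each is three `key value` lines giving the spec version, the object ID (hash algorithm and digest), and the size in bytes of the real file. A minimal sketch of reading such a pointer, assuming this three-key layout; the helper name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Hypothetical helper: split a Git LFS pointer into its fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>" separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the tekken.json hunk in this diff.
TEKKEN_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c604f35d1035f534519622c0ec83fed6184978d4fdee92a5bd2a50bc05438094
size 14801330
"""

pointer = parse_lfs_pointer(TEKKEN_POINTER)
print(pointer["hash_algo"], pointer["size"])
```

The `size` field can serve as a quick check that a downloaded file is the resolved object and not the pointer: the actual `tekken.json` should be 14801330 bytes, whereas a pointer file is only a few lines long.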
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff