nielsr (HF Staff) committed
Commit 2980473 · verified · 1 Parent(s): 95d11f4

Add pipeline tag and library name


This PR adds the `pipeline_tag` and `library_name` to the model card metadata to improve discoverability on the Hugging Face Hub. The `pipeline_tag` is set to `text-generation` as this model is a language model. The `library_name` is set to `transformers` as it utilizes the Hugging Face Transformers library.
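
For context, a minimal sketch of how this metadata is consumed downstream: with `library_name: transformers` and `pipeline_tag: text-generation`, the Hub widget and client code can load the checkpoint through the standard Transformers text-generation pipeline. The repo id below is a placeholder, not the actual model id.

```python
# Minimal sketch, assuming a placeholder repo id "org/GPG-7B" (substitute the
# real model id). The pipeline task matches the `pipeline_tag` added in this PR.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="org/GPG-7B",          # placeholder repo id
    torch_dtype=torch.bfloat16,  # config.json declares torch_dtype: bfloat16
    device_map="auto",
)

out = generator(
    "Please reason step by step: what is 17 * 23?",
    max_new_tokens=128,
)
print(out[0]["generated_text"])
```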

Files changed (1)
  1. README.md +489 -1
README.md CHANGED
@@ -1,7 +1,9 @@
  ---
- license: apache-2.0
  base_model:
  - Qwen/Qwen2.5-Math-7B
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
  ---

  ## Model ID
@@ -18,3 +20,489 @@ The RL model (GPG-7B in paper) trained on the simple1r_qwen_level3to5 dataset ba

  Due to changes in environment and devices, test results may fluctuate. Specifically, when tested on an NPU, the average accuracy of five datasets (AIME24, AMC23, MATH-500, Minerva and OlympiadBench) is 57.7. However, when tested on an H20 GPU, the average accuracy drops from 57.7 to 55.3. These fluctuations are entirely within an acceptable range.
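
The 57.7 and 55.3 figures are presumably unweighted means over the five benchmarks. A minimal sketch of that aggregation is below; the per-dataset values are placeholders, not this model's actual results.

```python
# Sketch of a macro-average over benchmark accuracies.
# The numbers below are placeholders, NOT reported results; only the averages
# (57.7 on NPU, 55.3 on H20 GPU) come from the model card itself.
def macro_average(acc_by_dataset: dict[str, float]) -> float:
    """Unweighted mean accuracy across benchmark datasets."""
    return sum(acc_by_dataset.values()) / len(acc_by_dataset)

placeholder_scores = {
    "AIME24": 0.0, "AMC23": 0.0, "MATH-500": 0.0,
    "Minerva": 0.0, "OlympiadBench": 0.0,
}
print(round(macro_average(placeholder_scores), 1))
```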
23
+ # File information
24
+
25
+ The repository contains the following file information:
26
+
27
+ Filename: tokenizer_config.json
28
+ Content: {
29
+ "add_bos_token": false,
30
+ "add_prefix_space": false,
31
+ "added_tokens_decoder": {
32
+ "151643": {
33
+ "content": "<|endoftext|>",
34
+ "lstrip": false,
35
+ "normalized": false,
36
+ "rstrip": false,
37
+ "single_word": false,
38
+ "special": true
39
+ },
40
+ "151644": {
41
+ "content": "<|im_start|>",
42
+ "lstrip": false,
43
+ "normalized": false,
44
+ "rstrip": false,
45
+ "single_word": false,
46
+ "special": true
47
+ },
48
+ "151645": {
49
+ "content": "<|im_end|>",
50
+ "lstrip": false,
51
+ "normalized": false,
52
+ "rstrip": false,
53
+ "single_word": false,
54
+ "special": true
55
+ },
56
+ "151646": {
57
+ "content": "<|object_ref_start|>",
58
+ "lstrip": false,
59
+ "normalized": false,
60
+ "rstrip": false,
61
+ "single_word": false,
62
+ "special": true
63
+ },
64
+ "151647": {
65
+ "content": "<|object_ref_end|>",
66
+ "lstrip": false,
67
+ "normalized": false,
68
+ "rstrip": false,
69
+ "single_word": false,
70
+ "special": true
71
+ },
72
+ "151648": {
73
+ "content": "<|box_start|>",
74
+ "lstrip": false,
75
+ "normalized": false,
76
+ "rstrip": false,
77
+ "single_word": false,
78
+ "special": true
79
+ },
80
+ "151649": {
81
+ "content": "<|box_end|>",
82
+ "lstrip": false,
83
+ "normalized": false,
84
+ "rstrip": false,
85
+ "single_word": false,
86
+ "special": true
87
+ },
88
+ "151650": {
89
+ "content": "<|quad_start|>",
90
+ "lstrip": false,
91
+ "normalized": false,
92
+ "rstrip": false,
93
+ "single_word": false,
94
+ "special": true
95
+ },
96
+ "151651": {
97
+ "content": "<|quad_end|>",
98
+ "lstrip": false,
99
+ "normalized": false,
100
+ "rstrip": false,
101
+ "single_word": false,
102
+ "special": true
103
+ },
104
+ "151652": {
105
+ "content": "<|vision_start|>",
106
+ "lstrip": false,
107
+ "normalized": false,
108
+ "rstrip": false,
109
+ "single_word": false,
110
+ "special": true
111
+ },
112
+ "151653": {
113
+ "content": "<|vision_end|>",
114
+ "lstrip": false,
115
+ "normalized": false,
116
+ "rstrip": false,
117
+ "single_word": false,
118
+ "special": true
119
+ },
120
+ "151654": {
121
+ "content": "<|vision_pad|>",
122
+ "lstrip": false,
123
+ "normalized": false,
124
+ "rstrip": false,
125
+ "single_word": false,
126
+ "special": true
127
+ },
128
+ "151655": {
129
+ "content": "<|image_pad|>",
130
+ "lstrip": false,
131
+ "normalized": false,
132
+ "rstrip": false,
133
+ "single_word": false,
134
+ "special": true
135
+ },
136
+ "151656": {
137
+ "content": "<|video_pad|>",
138
+ "lstrip": false,
139
+ "normalized": false,
140
+ "rstrip": false,
141
+ "single_word": false,
142
+ "special": true
143
+ },
144
+ "151657": {
145
+ "content": "<tool_call>",
146
+ "lstrip": false,
147
+ "normalized": false,
148
+ "rstrip": false,
149
+ "single_word": false,
150
+ "special": false
151
+ },
152
+ "151658": {
153
+ "content": "</tool_call>",
154
+ "lstrip": false,
155
+ "normalized": false,
156
+ "rstrip": false,
157
+ "single_word": false,
158
+ "special": false
159
+ },
160
+ "151659": {
161
+ "content": "<|fim_prefix|>",
162
+ "lstrip": false,
163
+ "normalized": false,
164
+ "rstrip": false,
165
+ "single_word": false,
166
+ "special": false
167
+ },
168
+ "151660": {
169
+ "content": "<|fim_middle|>",
170
+ "lstrip": false,
171
+ "normalized": false,
172
+ "rstrip": false,
173
+ "single_word": false,
174
+ "special": false
175
+ },
176
+ "151661": {
177
+ "content": "<|fim_suffix|>",
178
+ "lstrip": false,
179
+ "normalized": false,
180
+ "rstrip": false,
181
+ "single_word": false,
182
+ "special": false
183
+ },
184
+ "151662": {
185
+ "content": "<|fim_pad|>",
186
+ "lstrip": false,
187
+ "normalized": false,
188
+ "rstrip": false,
189
+ "single_word": false,
190
+ "special": false
191
+ },
192
+ "151663": {
193
+ "content": "<|repo_name|>",
194
+ "lstrip": false,
195
+ "normalized": false,
196
+ "rstrip": false,
197
+ "single_word": false,
198
+ "special": false
199
+ },
200
+ "151664": {
201
+ "content": "<|file_sep|>",
202
+ "lstrip": false,
203
+ "normalized": false,
204
+ "rstrip": false,
205
+ "single_word": false,
206
+ "special": false
207
+ }
208
+ },
209
+ "additional_special_tokens": [
210
+ "<|im_start|>",
211
+ "<|im_end|>",
212
+ "<|object_ref_start|>",
213
+ "<|object_ref_end|>",
214
+ "<|box_start|>",
215
+ "<|box_end|>",
216
+ "<|quad_start|>",
217
+ "<|quad_end|>",
218
+ "<|vision_start|>",
219
+ "<|vision_end|>",
220
+ "<|vision_pad|>",
221
+ "<|image_pad|>",
222
+ "<|video_pad|>"
223
+ ],
224
+ "bos_token": null,
225
+ "chat_template": "{%- if tools %}
226
+ {{- '<|im_start|>system\
227
+ ' }}
228
+ {%- if messages[0]['role'] == 'system' %}
229
+ {{- messages[0]['content'] }}
230
+ {%- else %}
231
+ {{- 'Please reason step by step, and put your final answer within \\\\boxed{}.' }}
232
+ {%- endif %}
233
+ {{- \"\
234
+ \
235
+ # Tools\
236
+ \
237
+ You may call one or more functions to assist with the user query.\
238
+ \
239
+ You are provided with function signatures within <tools></tools> XML tags:\
240
+ <tools>\" }}
241
+ {%- for tool in tools %}
242
+ {{- \"\
243
+ \" }}
244
+ {{- tool | tojson }}
245
+ {%- endfor %}
246
+ {{- \"\
247
+ </tools>\
248
+ \
249
+ For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\
250
+ <tool_call>\
251
+ {\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\
252
+ </tool_call><|im_end|>\
253
+ \" }}
254
+ {%- else %}
255
+ {%- if messages[0]['role'] == 'system' %}
256
+ {{- '<|im_start|>system\
257
+ ' + messages[0]['content'] + '<|im_end|>\
258
+ ' }}
259
+ {%- else %}
260
+ {{- '<|im_start|>system\
261
+ Please reason step by step, and put your final answer within \\\\boxed{}.<|im_end|>\
262
+ ' }}
263
+ {%- endif %}
264
+ {%- endif %}
265
+ {%- for message in messages %}
266
+ {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}
267
+ {{- '<|im_start|>' + message.role + '\
268
+ ' + message.content + '<|im_end|>' + '\
269
+ ' }}
270
+ {%- elif message.role == \"assistant\" %}
271
+ {{- '<|im_start|>' + message.role }}
272
+ {%- if message.content %}
273
+ {{- '\
274
+ ' + message.content }}
275
+ {%- endif %}
276
+ {%- for tool_call in message.tool_calls %}
277
+ {%- if tool_call.function is defined %}
278
+ {%- set tool_call = tool_call.function %}
279
+ {%- endif %}
280
+ {{- '\
281
+ <tool_call>\
282
+ {\\\"name\\\": \"' }}
283
+ {{- tool_call.name }}
284
+ {{- '\", \\\"arguments\\\": ' }}
285
+ {{- tool_call.arguments | tojson }}
286
+ {{- '}\
287
+ </tool_call>' }}
288
+ {%- endfor %}
289
+ {{- '<|im_end|>\
290
+ ' }}
291
+ {%- elif message.role == \"tool\" %}
292
+ {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}
293
+ {{- '<|im_start|>user' }}
294
+ {%- endif %}
295
+ {{- '\
296
+ <tool_response>\
297
+ ' }}
298
+ {{- message.content }}
299
+ {{- '\
300
+ </tool_response>' }}
301
+ {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}
302
+ {{- '<|im_end|>\
303
+ ' }}
304
+ {%- endif %}
305
+ {%- endif %}
306
+ {%- endfor %}
307
+ {%- if add_generation_prompt %}
308
+ {{- '<|im_start|>assistant\
309
+ ' }}
310
+ {%- endif %}
311
+ ",
312
+ "clean_up_tokenization_spaces": false,
313
+ "eos_token": "<|endoftext|>",
314
+ "errors": "replace",
315
+ "extra_special_tokens": {},
316
+ "model_max_length": 131072,
317
+ "pad_token": "<|endoftext|>",
318
+ "split_special_tokens": false,
319
+ "tokenizer_class": "Qwen2Tokenizer",
320
+ "unk_token": null
321
+ }
322
+
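
Given this tokenizer_config.json, the chat_template above falls back to the system prompt "Please reason step by step, and put your final answer within \boxed{}." when no system message is supplied. A minimal usage sketch, assuming a placeholder repo id:

```python
# Sketch: render a chat prompt with the tokenizer's chat_template.
# "org/GPG-7B" is a placeholder repo id, not the actual model id.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("org/GPG-7B")

messages = [
    {"role": "user", "content": "Find the sum of the roots of x^2 - 5x + 6 = 0."}
]

# With no system message, the template inserts the default
# "Please reason step by step, and put your final answer within \boxed{}." prompt.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # appends "<|im_start|>assistant\n"
)
print(prompt)
```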
323
+ Filename: generation_config.json
324
+ Content: {
325
+ "_from_model_config": true,
326
+ "bos_token_id": 151643,
327
+ "eos_token_id": 151643,
328
+ "transformers_version": "4.49.0"
329
+ }
330
+
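
A small sketch of reading these generation defaults back with Transformers (placeholder repo id); per tokenizer_config.json, token id 151643 is `<|endoftext|>`:

```python
# Sketch: inspect the generation defaults listed above.
# "org/GPG-7B" is a placeholder repo id.
from transformers import GenerationConfig

gen_cfg = GenerationConfig.from_pretrained("org/GPG-7B")
print(gen_cfg.bos_token_id)  # 151643
print(gen_cfg.eos_token_id)  # 151643 -> "<|endoftext|>"
```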
331
+ Filename: vocab.json
332
+ Content: "Content of the file is larger than 50 KB, too long to display."
333
+
334
+ Filename: added_tokens.json
335
+ Content: {
336
+ "</tool_call>": 151658,
337
+ "<tool_call>": 151657,
338
+ "<|box_end|>": 151649,
339
+ "<|box_start|>": 151648,
340
+ "<|endoftext|>": 151643,
341
+ "<|file_sep|>": 151664,
342
+ "<|fim_middle|>": 151660,
343
+ "<|fim_pad|>": 151662,
344
+ "<|fim_prefix|>": 151659,
345
+ "<|fim_suffix|>": 151661,
346
+ "<|im_end|>": 151645,
347
+ "<|im_start|>": 151644,
348
+ "<|image_pad|>": 151655,
349
+ "<|object_ref_end|>": 151647,
350
+ "<|object_ref_start|>": 151646,
351
+ "<|quad_end|>": 151651,
352
+ "<|quad_start|>": 151650,
353
+ "<|repo_name|>": 151663,
354
+ "<|video_pad|>": 151656,
355
+ "<|vision_end|>": 151653,
356
+ "<|vision_pad|>": 151654,
357
+ "<|vision_start|>": 151652
358
+ }
359
+
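
The added-token ids above can be recovered directly from the loaded tokenizer, as in this short sketch (placeholder repo id):

```python
# Sketch: map special tokens back to the ids listed in added_tokens.json.
# "org/GPG-7B" is a placeholder repo id.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("org/GPG-7B")
print(tok.convert_tokens_to_ids("<|im_start|>"))   # 151644
print(tok.convert_tokens_to_ids("<|im_end|>"))     # 151645
print(tok.convert_tokens_to_ids("<|endoftext|>"))  # 151643
```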
360
+ Filename: config.json
361
+ Content: {
362
+ "_name_or_path": "/mnt/workspace/common/models/Qwen2.5-Math-7B",
363
+ "architectures": [
364
+ "Qwen2ForCausalLM"
365
+ ],
366
+ "attention_dropout": 0.0,
367
+ "bos_token_id": 151643,
368
+ "eos_token_id": 151643,
369
+ "hidden_act": "silu",
370
+ "hidden_size": 3584,
371
+ "initializer_range": 0.02,
372
+ "intermediate_size": 18944,
373
+ "max_position_embeddings": 4096,
374
+ "max_window_layers": 28,
375
+ "model_type": "qwen2",
376
+ "num_attention_heads": 28,
377
+ "num_hidden_layers": 28,
378
+ "num_key_value_heads": 4,
379
+ "rms_norm_eps": 1e-06,
380
+ "rope_scaling": null,
381
+ "rope_theta": 10000,
382
+ "sliding_window": 4096,
383
+ "tie_word_embeddings": false,
384
+ "torch_dtype": "bfloat16",
385
+ "transformers_version": "4.49.0",
386
+ "use_cache": true,
387
+ "use_mrope": false,
388
+ "use_sliding_window": false,
389
+ "vocab_size": 152064
390
+ }
391
+
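
A minimal sketch of loading the architecture hyper-parameters above with `AutoConfig` (placeholder repo id); the config describes a Qwen2 causal LM with grouped-query attention:

```python
# Sketch: inspect the model configuration listed in config.json.
# "org/GPG-7B" is a placeholder repo id.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("org/GPG-7B")
assert config.model_type == "qwen2"
print(config.hidden_size)               # 3584
print(config.num_hidden_layers)         # 28
print(config.num_attention_heads)       # 28 query heads
print(config.num_key_value_heads)       # 4 (grouped-query attention)
print(config.max_position_embeddings)   # 4096
```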
392
+ Filename: tokenizer.json
393
+ Content: "Content of the file is larger than 50 KB, too long to display."
394
+
395
+ Filename: model.safetensors.index.json
396
+ Content: {
397
+ "metadata": {
398
+ "total_size": 15231233024
399
+ },
400
+ "weight_map": {
401
+ "lm_head.weight": "model-00002-of-00004.safetensors",
402
+ "model.embed_tokens.weight": "model-00004-of-00004.safetensors",
403
+ "model.layers.0.input_layernorm.weight": "model-00002-of-00004.safetensors",
404
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
405
+ "model.layers.0.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
406
+ "model.layers.0.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
407
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
408
+ "model.layers.0.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
409
+ "model.layers.0.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
410
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
411
+ "model.layers.0.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
412
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
413
+ "model.layers.0.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
414
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
415
+ "model.layers.1.input_layernorm.weight": "model-00001-of-00004.safetensors",
416
+ "model.layers.1.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
417
+ "model.layers.1.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
418
+ "model.layers.1.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
419
+ "model.layers.1.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
420
+ "model.layers.1.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
421
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
422
+ "model.layers.1.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
423
+ "model.layers.1.self_attn.q_proj.bias": "model-00003-of-00004.safetensors",
424
+ "model.layers.1.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
425
+ "model.layers.1.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
426
+ "model.layers.1.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
427
+ "model.layers.10.input_layernorm.weight": "model-00002-of-00004.safetensors",
428
+ "model.layers.10.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
429
+ "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
430
+ "model.layers.10.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
431
+ "model.layers.10.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
432
+ "model.layers.10.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
433
+ "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
434
+ "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
435
+ "model.layers.10.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
436
+ "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
437
+ "model.layers.10.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
438
+ "model.layers.10.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
439
+ "model.layers.11.input_layernorm.weight": "model-00002-of-00004.safetensors",
440
+ "model.layers.11.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
441
+ "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
442
+ "model.layers.11.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
443
+ "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
444
+ "model.layers.11.self_attn.k_proj.bias": "model-00002-of-00004.safetensors",
445
+ "model.layers.11.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
446
+ "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00004.safetensors",
447
+ "model.layers.11.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
448
+ "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
449
+ "model.layers.11.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
450
+ "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
451
+ "model.layers.12.input_layernorm.weight": "model-00003-of-00004.safetensors",
452
+ "model.layers.12.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
453
+ "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00004.safetensors",
454
+ "model.layers.12.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
455
+ "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
456
+ "model.layers.12.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
457
+ "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
458
+ "model.layers.12.self_attn.o_proj.weight": "model-00003-of-00004.safetensors",
459
+ "model.layers.12.self_attn.q_proj.bias": "model-00001-of-00004.safetensors",
460
+ "model.layers.12.self_attn.q_proj.weight": "model-00003-of-00004.safetensors",
461
+ "model.layers.12.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
462
+ "model.layers.12.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
463
+ "model.layers.13.input_layernorm.weight": "model-00003-of-00004.safetensors",
464
+ "model.layers.13.mlp.down_proj.weight": "model-00003-of-00004.safetensors",
465
+ "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
466
+ "model.layers.13.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
467
+ "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00004.safetensors",
468
+ "model.layers.13.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
469
+ "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00004.safetensors",
470
+ "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
471
+ "model.layers.13.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
472
+ "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00004.safetensors",
473
+ "model.layers.13.self_attn.v_proj.bias": "model-00001-of-00004.safetensors",
474
+ "model.layers.13.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
475
+ "model.layers.14.input_layernorm.weight": "model-00001-of-00004.safetensors",
476
+ "model.layers.14.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
477
+ "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
478
+ "model.layers.14.mlp.up_proj.weight": "model-00003-of-00004.safetensors",
479
+ "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
480
+ "model.layers.14.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
481
+ "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
482
+ "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
483
+ "model.layers.14.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
484
+ "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
485
+ "model.layers.14.self_attn.v_proj.bias": "model-00003-of-00004.safetensors",
486
+ "model.layers.14.self_attn.v_proj.weight": "model-00003-of-00004.safetensors",
487
+ "model.layers.15.input_layernorm.weight": "model-00001-of-00004.safetensors",
488
+ "model.layers.15.mlp.down_proj.weight": "model-00002-of-00004.safetensors",
489
+ "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00004.safetensors",
490
+ "model.layers.15.mlp.up_proj.weight": "model-00002-of-00004.safetensors",
491
+ "model.layers.15.post_attention_layernorm.weight": "model-00003-of-00004.safetensors",
492
+ "model.layers.15.self_attn.k_proj.bias": "model-00003-of-00004.safetensors",
493
+ "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00004.safetensors",
494
+ "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
495
+ "model.layers.15.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
496
+ "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
497
+ "model.layers.15.self_attn.v_proj.bias": "model-00002-of-00004.safetensors",
498
+ "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00004.safetensors",
499
+ "model.layers.16.input_layernorm.weight": "model-00001-of-00004.safetensors",
500
+ "model.layers.16.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
501
+ "model.layers.16.mlp.gate_proj.weight": "model-00003-of-00004.safetensors",
502
+ "model.layers.16.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
503
+ "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00004.safetensors",
504
+ "model.layers.16.self_attn.k_proj.bias": "model-00001-of-00004.safetensors",
505
+ "model.layers.16.self_attn.k_proj.weight": "model-00003-of-00004.safetensors",
506
+ "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00004.safetensors",
507
+ "model.layers.16.self_attn.q_proj.bias": "model-00002-of-00004.safetensors",
508
+ "model.layers.16.self_attn.q_proj