/home/floriadmin/miniforge3/envs/mlc/bin/python -m mlc_llm gen_config ../dist/models/Qwen1.5-4B --quantization q4f32_1 --conv-template chatml --output /tmp/tmp78htwu3y
[2024-03-18 19:18:16] INFO auto_config.py:115: Found model configuration: ../dist/models/Qwen1.5-4B/config.json
[2024-03-18 19:18:16] INFO auto_config.py:153: Found model type: qwen2. Use `--model-type` to override.
[2024-03-18 19:18:16] INFO qwen2_model.py:46: context_window_size not found in config.json. Falling back to max_position_embeddings (32768)
[2024-03-18 19:18:16] INFO qwen2_model.py:60: prefill_chunk_size defaults to context_window_size (32768)
[2024-03-18 19:18:16] WARNING config.py:99: Warning: Cannot override max_batch_size, because QWen2Config does not have this field
[2024-03-18 19:18:16] INFO gen_config.py:133: [generation_config.json] Setting bos_token_id: 151643
[2024-03-18 19:18:16] INFO gen_config.py:133: [generation_config.json] Setting eos_token_id: 151643
[2024-03-18 19:18:16] INFO gen_config.py:147: Not found tokenizer config: ../dist/models/Qwen1.5-4B/tokenizer.model
[2024-03-18 19:18:16] INFO gen_config.py:145: Found tokenizer config: ../dist/models/Qwen1.5-4B/tokenizer.json. Copying to /tmp/tmp78htwu3y/tokenizer.json
[2024-03-18 19:18:16] INFO gen_config.py:145: Found tokenizer config: ../dist/models/Qwen1.5-4B/vocab.json. Copying to /tmp/tmp78htwu3y/vocab.json
[2024-03-18 19:18:16] INFO gen_config.py:145: Found tokenizer config: ../dist/models/Qwen1.5-4B/merges.txt. Copying to /tmp/tmp78htwu3y/merges.txt
[2024-03-18 19:18:16] INFO gen_config.py:147: Not found tokenizer config: ../dist/models/Qwen1.5-4B/added_tokens.json
[2024-03-18 19:18:16] INFO gen_config.py:145: Found tokenizer config: ../dist/models/Qwen1.5-4B/tokenizer_config.json. Copying to /tmp/tmp78htwu3y/tokenizer_config.json
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting pad_token_id: 0
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting temperature: 0.7
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting presence_penalty: 0.0
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting frequency_penalty: 0.0
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting repetition_penalty: 1.0
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting top_p: 0.95
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting mean_gen_len: 128
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting max_gen_len: 512
[2024-03-18 19:18:16] INFO gen_config.py:75: [System default] Setting shift_fill_factor: 0.3
[2024-03-18 19:18:16] INFO gen_config.py:198: Dumping configuration file to: /tmp/tmp78htwu3y/mlc-chat-config.json
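The [System default] values above, together with the detected model type, conversation template, and context sizes, are what gen_config writes into mlc-chat-config.json. A minimal sketch of those fields in Python, reassembled from the log messages alone; it is not a dump of the real file, and exact key names or nesting may differ:

    import json

    # Sketch only: fields reconstructed from the gen_config log above,
    # not copied from the actual /tmp/tmp78htwu3y/mlc-chat-config.json.
    mlc_chat_config = {
        "model_type": "qwen2",
        "quantization": "q4f32_1",
        "conv_template": "chatml",
        "context_window_size": 32768,
        "prefill_chunk_size": 32768,
        "bos_token_id": 151643,
        "eos_token_id": 151643,
        "pad_token_id": 0,
        "temperature": 0.7,
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0,
        "repetition_penalty": 1.0,
        "top_p": 0.95,
        "mean_gen_len": 128,
        "max_gen_len": 512,
        "shift_fill_factor": 0.3,
    }
    print(json.dumps(mlc_chat_config, indent=2))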
/home/floriadmin/miniforge3/envs/mlc/bin/python -m mlc_llm convert_weight ../dist/models/Qwen1.5-4B --quantization q4f32_1 --source-format auto --output /tmp/tmp78htwu3y
[2024-03-18 19:18:17] INFO auto_config.py:115: Found model configuration: ../dist/models/Qwen1.5-4B/config.json
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:0
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:1
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:2
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:3
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:4
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:5
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:6
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:7
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:8
[2024-03-18 19:18:18] INFO auto_device.py:76: Found device: cuda:9
[2024-03-18 19:18:19] INFO auto_device.py:85: Not found device: rocm:0
[2024-03-18 19:18:20] INFO auto_device.py:85: Not found device: metal:0
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:0
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:1
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:2
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:3
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:4
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:5
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:6
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:7
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:8
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:9
[2024-03-18 19:18:22] INFO auto_device.py:76: Found device: vulkan:10
[2024-03-18 19:18:23] INFO auto_device.py:85: Not found device: opencl:0
[2024-03-18 19:18:23] INFO auto_device.py:33: Using device: cuda:0
[2024-03-18 19:18:23] INFO auto_weight.py:70: Finding weights in: ../dist/models/Qwen1.5-4B
[2024-03-18 19:18:23] INFO auto_weight.py:136: Not found Huggingface PyTorch
[2024-03-18 19:18:23] INFO auto_weight.py:143: Found source weight format: huggingface-safetensor. Source configuration: ../dist/models/Qwen1.5-4B/model.safetensors.index.json
[2024-03-18 19:18:23] INFO auto_weight.py:106: Using source weight configuration: ../dist/models/Qwen1.5-4B/model.safetensors.index.json. Use `--source` to override.
[2024-03-18 19:18:23] INFO auto_weight.py:110: Using source weight format: huggingface-safetensor. Use `--source-format` to override.
[2024-03-18 19:18:23] INFO auto_config.py:153: Found model type: qwen2. Use `--model-type` to override.
[2024-03-18 19:18:23] INFO qwen2_model.py:46: context_window_size not found in config.json. Falling back to max_position_embeddings (32768)
[2024-03-18 19:18:23] INFO qwen2_model.py:60: prefill_chunk_size defaults to context_window_size (32768)
Weight conversion with arguments:
  --config          ../dist/models/Qwen1.5-4B/config.json
  --quantization    GroupQuantize(name='q4f32_1', kind='group-quant', group_size=40, quantize_dtype='int4', storage_dtype='uint32', model_dtype='float32', linear_weight_layout='NK', quantize_embedding=True, quantize_final_fc=True, num_elem_per_storage=8, num_storage_per_group=5, max_int_value=7)
  --model-type      qwen2
  --device          cuda:0
  --source          ../dist/models/Qwen1.5-4B/model.safetensors.index.json
  --source-format   huggingface-safetensor
  --output          /tmp/tmp78htwu3y
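The GroupQuantize arguments fully determine the quantized tensor shapes reported below: each weight is cut along axis 1 into groups of 40 float32 values, and each group is packed into 5 uint32 words (8 int4 values per word) plus one float32 scale. A small self-check of that arithmetic against the shapes in the log; the helper is illustrative, not part of mlc_llm:

    from math import ceil

    def q4f32_1_shapes(rows, cols, group_size=40, num_elem_per_storage=8):
        """Expected (q_weight, q_scale) shapes when quantizing along axis=1."""
        groups = ceil(cols / group_size)  # one float32 scale per group
        words = groups * (group_size // num_elem_per_storage)  # 5 uint32 words per group
        return (rows, words), (rows, groups)

    # These match the loader output below:
    assert q4f32_1_shapes(2560, 6912) == ((2560, 865), (2560, 173))       # mlp.down_proj
    assert q4f32_1_shapes(13824, 2560) == ((13824, 320), (13824, 64))     # mlp.gate_up_proj
    assert q4f32_1_shapes(7680, 2560) == ((7680, 320), (7680, 64))        # self_attn.c_attn
    assert q4f32_1_shapes(151936, 2560) == ((151936, 320), (151936, 64))  # embed_tokens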
Start storing to cache /tmp/tmp78htwu3y
  0%|          | 0/283 [00:00<?, ?it/s]
/home/floriadmin/miniforge3/envs/mlc/lib/python3.11/site-packages/numpy/core/getlimits.py:518: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/home/floriadmin/miniforge3/envs/mlc/lib/python3.11/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
[2024-03-18 19:18:47] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.20.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:47] INFO group_quantization.py:232: Compiling quantize function for key: ((2560, 6912), float32, cuda, axis=1, output_transpose=False)
[2024-03-18 19:18:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:48] INFO group_quantization.py:232: Compiling quantize function for key: ((13824, 2560), float32, cuda, axis=1, output_transpose=False)
[2024-03-18 19:18:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:49] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.20.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:49] INFO group_quantization.py:232: Compiling quantize function for key: ((2560, 2560), float32, cuda, axis=1, output_transpose=False)
[2024-03-18 19:18:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:49] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.21.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:50] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.21.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:50] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.21.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:50] INFO group_quantization.py:232: Compiling quantize function for key: ((7680, 2560), float32, cuda, axis=1, output_transpose=False)
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.21.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:51] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.22.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:52] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.22.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:52] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.22.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.22.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
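Note that "Compiling quantize function" lines appear only the first time a weight shape is seen: layer 20 triggers compiles for (2560, 6912), (13824, 2560) and (2560, 2560), and layer 21 adds (7680, 2560), the fused QKV projection (3 x 2560 output rows in one matmul). Every later layer reuses those cached kernels, which is why the timestamps show later layers converting in well under a second. A toy sketch of that shape-keyed caching, illustrative only and not mlc_llm's actual code:

    # Compile once per (shape, dtype, device, axis, output_transpose) key,
    # then reuse the kernel for every parameter with the same signature.
    _quantize_fns = {}

    def get_quantize_fn(shape, dtype="float32", device="cuda",
                        axis=1, output_transpose=False):
        key = (shape, dtype, device, axis, output_transpose)
        if key not in _quantize_fns:
            print(f"Compiling quantize function for key: {key}")
            _quantize_fns[key] = lambda w: w  # stand-in for the compiled kernel
        return _quantize_fns[key]

    get_quantize_fn((2560, 6912))  # layer 20: compiles
    get_quantize_fn((2560, 6912))  # layers 21-39: cache hit, no log line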
[2024-03-18 19:18:52] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.23.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.23.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.23.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.23.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.24.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:54] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.24.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:54] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.24.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.24.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:54] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.25.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:55] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.25.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:55] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.25.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.25.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:56] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.26.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:56] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.26.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:56] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.26.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.26.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.27.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.27.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:57] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.27.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.27.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:58] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.28.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.28.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.28.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.28.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.29.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:18:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.29.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.29.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.29.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.30.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:00] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:01] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:01] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:01] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.30.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:01] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.30.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:01] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:01] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:01] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:01] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.30.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:01] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.31.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:02] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:02] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:02] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:02] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:02] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.31.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:02] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.31.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:02] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:02] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:03] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:03] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.31.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:03] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.32.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:03] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:03] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:03] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:03] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:03] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.32.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:03] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.32.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.32.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:04] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.33.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:04] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:05] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.33.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:05] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.33.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.33.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:05] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.34.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:05] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.34.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.34.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.34.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.35.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:06] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.35.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.35.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.35.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.36.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:07] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:08] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:08] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:08] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.36.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:08] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.36.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:08] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:08] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:08] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:08] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.36.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:08] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.37.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:09] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:09] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:09] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:09] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:09] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.37.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:09] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.37.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:09] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.37.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:10] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.38.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:10] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:10] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.38.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:10] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.38.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:11] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:11] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:11] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:11] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.38.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:11] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.39.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:11] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:11] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:12] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.39.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.39.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:12] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:12] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.39.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.norm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:12] INFO huggingface_loader.py:194: Unloading HF weight file: ../dist/models/Qwen1.5-4B/model-00002-of-00002.safetensors
[2024-03-18 19:19:13] INFO huggingface_loader.py:182: Loading HF parameters from: ../dist/models/Qwen1.5-4B/model-00001-of-00002.safetensors
[2024-03-18 19:19:34] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.embed_tokens.q_weight", shape: (151936, 320), dtype: uint32
[2024-03-18 19:19:35] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.embed_tokens.q_scale", shape: (151936, 64), dtype: float32
[2024-03-18 19:19:35] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.0.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:35] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:35] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
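The loader works shard by shard: it converts everything it can from model-00002-of-00002.safetensors, unloads it, then loads model-00001-of-00002.safetensors (the roughly 20-second gap before model.embed_tokens appears is dominated by reading the new shard and quantizing the 151936-row embedding). A rough sketch of that access pattern, driven by the index file named earlier; this is illustrative, not mlc_llm's loader, and assumes the safetensors package is installed:

    import json
    from collections import defaultdict
    from safetensors import safe_open

    model_dir = "../dist/models/Qwen1.5-4B"
    index = json.load(open(f"{model_dir}/model.safetensors.index.json"))

    # Group parameter names by the shard file that stores them.
    by_shard = defaultdict(list)
    for name, shard in index["weight_map"].items():
        by_shard[shard].append(name)

    # Visit one shard at a time so only one file is resident in memory.
    for shard, names in by_shard.items():
        with safe_open(f"{model_dir}/{shard}", framework="np") as f:
            for name in names:
                tensor = f.get_tensor(name)  # quantize-and-store would happen here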
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:36] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.0.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:36] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.0.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.0.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:36] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.1.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:36] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:37] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:37] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:37] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.1.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:37] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.1.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:37] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:37] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:37] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:37] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.1.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:37] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.10.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:38] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:38] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:38] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:38] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:38] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.10.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:38] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.10.self_attn.c_attn.bias", shape: (7680,), dtype: float32
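Note: the parameters stream in lexicographic rather than numeric layer order — layer 1 above is followed by layer 10, and layer 19 will later be followed by layer 2, then 20, then 3. This is consistent with the loader walking string-sorted parameter names (the sorting is an assumption about the implementation; the order itself is plain in the log):

    # String sorting puts "10" between "1" and "2", matching the order in the log.
    names = ["model.layers.%d.input_layernorm.weight" % i for i in range(21)]
    for name in sorted(names):
        print(name)  # layers.0, layers.1, layers.10, ..., layers.19, layers.2, layers.20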
[2024-03-18 19:19:38] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:38] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:39] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:39] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.10.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:39] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.11.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:39] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:39] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:39] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:39] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:39] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.11.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:39] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.11.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.11.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.12.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:40] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.12.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:40] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.12.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.12.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:41] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.13.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:41] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:42] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.13.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:42] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.13.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.13.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:42] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.14.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:42] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.14.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.14.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.14.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.15.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:43] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:44] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:44] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:44] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.15.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:44] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.15.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:44] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:44] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:44] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:44] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.15.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:44] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.16.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.16.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.16.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:45] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.16.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:45] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.17.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:46] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:46] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:46] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:46] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:46] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.17.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:46] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.17.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:46] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.17.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:47] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.18.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:47] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:47] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.18.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:47] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.18.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.18.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:48] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.19.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:48] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:49] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.19.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:49] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.19.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.19.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:49] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.2.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:49] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.2.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.2.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.2.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.20.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:50] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.20.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:50] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.3.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:51] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.3.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:51] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.3.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:51] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.3.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:52] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.4.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:52] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:52] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.4.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:52] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.4.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.4.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:53] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.5.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:53] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:54] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.5.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:54] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.5.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.5.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:54] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.6.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:54] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.6.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.6.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.6.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.7.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:55] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.7.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.7.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.7.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.8.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:56] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:57] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.8.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:57] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.8.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
[2024-03-18 19:19:57] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.8.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32
[2024-03-18 19:19:57] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.9.input_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.mlp.down_proj.q_weight", shape: (2560, 865), dtype: uint32
[2024-03-18 19:19:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.mlp.down_proj.q_scale", shape: (2560, 173), dtype: float32
[2024-03-18 19:19:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.mlp.gate_up_proj.q_weight", shape: (13824, 320), dtype: uint32
[2024-03-18 19:19:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.mlp.gate_up_proj.q_scale", shape: (13824, 64), dtype: float32
[2024-03-18 19:19:58] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.9.post_attention_layernorm.weight", shape: (2560,), dtype: float32
[2024-03-18 19:19:58] INFO huggingface_loader.py:172: [Not quantized] Parameter: "model.layers.9.self_attn.c_attn.bias", shape: (7680,), dtype: float32
[2024-03-18 19:19:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.self_attn.c_attn.q_weight", shape: (7680, 320), dtype: uint32
[2024-03-18 19:19:58] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.self_attn.c_attn.q_scale", shape: (7680, 64), dtype: float32
[2024-03-18 19:19:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32
huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.self_attn.o_proj.q_weight", shape: (2560, 320), dtype: uint32 100%|█████████████████████████████████████████████████████████████████████████████████████████▋| 282/283 [01:32<00:00, 6.17it/s] [2024-03-18 19:19:59] INFO huggingface_loader.py:164: [Quantized] Parameter: "model.layers.9.self_attn.o_proj.q_scale", shape: (2560, 64), dtype: float32 100%|█████████████████████████████████████████████████████████████████████████████████████████▋| 282/283 [01:32<00:00, 6.17it/s] 100%|██████████████████████████████████████████████████████████████████████████████████████████| 283/283 [01:32<00:00, 3.07it/s] [2024-03-18 19:19:59] INFO huggingface_loader.py:194: Unloading HF weight file: ../dist/models/Qwen1.5-4B/model-00001-of-00002.safetensors [2024-03-18 19:19:59] INFO stats.py:76: Time usage: HF loading: 25.268 sec; Pre-quantization mapping: 9.640 sec; Quantization: 3.597 sec [2024-03-18 19:19:59] INFO stats.py:90: RAM usage: Peak RAM: 7.431 GB. Total bytes loaded from disk: 14.716 GB [2024-03-18 19:19:59] INFO convert_weight.py:156: Parameter size after quantization: 2.210 GB [2024-03-18 19:19:59] INFO convert_weight.py:161: Total parameters: 3,950,369,280 [2024-03-18 19:19:59] INFO convert_weight.py:162: Bits per parameter: 4.805 [2024-03-18 19:19:59] INFO convert_weight.py:167: Saved to directory: /tmp/tmp78htwu3y All finished, 83 total shards committed, record saved to /tmp/tmp78htwu3y/ndarray-cache.json Also saved a bf16 record to /tmp/tmp78htwu3y/ndarray-cache-b16.json
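
The q_weight/q_scale shapes in the listing above encode the q4f32_1 layout directly: each row packs 4-bit values eight to a uint32 word, with one float32 scale per 40-element group (readable off the log itself: o_proj q_weight has 320 x 8 = 2560 packed elements per row, and 2560 / 64 scales = 40 elements per group). Below is a minimal sketch of that packing arithmetic, not MLC's actual code; the 6912 down_proj input width is an assumption taken from Qwen1.5-4B's config.json.

import math

def q4f32_1_shapes(out_features: int, in_features: int,
                   group_size: int = 40, elems_per_uint32: int = 8):
    # Rows are padded up to a whole number of groups, then each group
    # of 40 int4 values is packed into 5 uint32 words plus 1 float32 scale.
    groups = math.ceil(in_features / group_size)        # q_scale columns
    words = groups * (group_size // elems_per_uint32)   # q_weight columns
    return (out_features, words), (out_features, groups)

# o_proj is 2560x2560: 64 groups -> 320 words per row.
print(q4f32_1_shapes(2560, 2560))   # ((2560, 320), (2560, 64))
# down_proj input 6912 (assumed): padded to 173 groups = 6920 elements,
# which is why the log shows 865 words rather than 6912/8 = 864.
print(q4f32_1_shapes(2560, 6912))   # ((2560, 865), (2560, 173))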
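
The closing stats are mutually consistent: 4 bits of payload plus a 32-bit scale shared across 40 elements gives 4 + 32/40 = 4.8 bits per quantized element, and the float32 layernorm weights and c_attn biases left unquantized pull the logged average up slightly, to 4.805. A quick sanity check, assuming the logged "GB" figures are GiB (2**30 bytes):

GROUP_SIZE = 40                          # from the q_weight/q_scale shapes above
per_quantized = 4 + 32 / GROUP_SIZE      # int4 payload + float32 scale share
print(per_quantized)                     # 4.8 bits per quantized element

total_params = 3_950_369_280             # "Total parameters" in the log
size_bits = 2.210 * 2**30 * 8            # "Parameter size after quantization"
print(round(size_bits / total_params, 3))
# ~4.806 -- consistent with the logged 4.805 (the 2.210 GB figure is rounded)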
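
The committed-shard record is plain JSON and can be inspected directly if you want to double-check the "83 total shards" figure. A hedged sketch; the "records" and "dataPath" field names are assumptions based on the tvmjs ndarray-cache layout, not something this log confirms:

import json

# Load the shard record written at the end of conversion.
with open("/tmp/tmp78htwu3y/ndarray-cache.json") as f:
    cache = json.load(f)

# One entry per committed shard file (assumed layout).
shards = cache.get("records", [])
print(len(shards))                 # expect 83, matching the log above
if shards:
    print(shards[0].get("dataPath"))   # e.g. the first params_shard_*.bin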