2023-11-02 18:06:22.945 | INFO | mmgpt.model.builder:build_model_tokenizer:85 - LlamaTokenizer(name_or_path='/data/hypertext/yuangpeng/huggingface_cache/models--lmsys--vicuna-7b-v15', vocab_size=32000, model_max_length=2048, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False), 'eos_token': AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False), 'unk_token': AddedToken("", rstrip=False, lstrip=False, single_word=False, normalized=False), 'pad_token': ''}, clean_up_tokenization_spaces=False)
2023-11-02 18:06:29.149 | INFO | mmgpt.model.mmgpt.base_mmgpt:build_vision_tokenizer:52 - CLIPImageProcessor {
  "crop_size": {
    "height": 448,
    "width": 448
  },
  "do_center_crop": true,
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "feature_extractor_type": "CLIPFeatureExtractor",
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "CLIPImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "shortest_edge": 448
  }
}
2023-11-02 18:06:35.538 | INFO | mmgpt.model.mmgpt.base_mmgpt:build_vision_tokenizer:64 - 2 new tokens are added to be trained.
2023-11-02 18:06:35.698 | INFO | mmgpt.model.builder:build_model_tokenizer:148 - MMGPTLlamaForCausalLM(
  (model): MMGPTLlamaModel(
    (embed_tokens): Embedding(32003, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
    (vision_tower): CLIPVisionTower(
      (vision_tower): CLIPVisionModel(
        (vision_model): CLIPVisionTransformer(
          (embeddings): CLIPVisionEmbeddings(
            (patch_embedding): Conv2d(3, 1024, kernel_size=(14, 14), stride=(14, 14), bias=False)
            (position_embedding): Embedding(1025, 1024)
          )
          (pre_layrnorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (encoder): CLIPEncoder(
            (layers): ModuleList(
              (0-23): 24 x CLIPEncoderLayer(
                (self_attn): CLIPAttention(
                  (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
                  (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
                  (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
                  (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
                )
                (layer_norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (mlp): CLIPMLP(
                  (activation_fn): QuickGELUActivation()
                  (fc1): Linear(in_features=1024, out_features=4096, bias=True)
                  (fc2): Linear(in_features=4096, out_features=1024, bias=True)
                )
                (layer_norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              )
            )
          )
          (post_layernorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
    )
    (projector): ConvProjector(
      (projector): Conv2d(1024, 4096, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    )
  )
  (lm_head): Linear(in_features=4096, out_features=32003, bias=False)
)
2023-11-02 18:06:49.671 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.embed_tokens.weight
2023-11-02 18:06:49.672 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.self_attn.q_proj.weight
2023-11-02 18:06:49.672 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.self_attn.k_proj.weight
2023-11-02 18:06:49.673 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.self_attn.v_proj.weight
2023-11-02 18:06:49.673 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.self_attn.o_proj.weight
2023-11-02 18:06:49.673 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.mlp.gate_proj.weight
2023-11-02 18:06:49.673 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.mlp.up_proj.weight
2023-11-02 18:06:49.673 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.mlp.down_proj.weight
2023-11-02 18:06:49.674 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.input_layernorm.weight
2023-11-02 18:06:49.674 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.0.post_attention_layernorm.weight
2023-11-02 18:06:49.674 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.self_attn.q_proj.weight
2023-11-02 18:06:49.674 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.self_attn.k_proj.weight
2023-11-02 18:06:49.674 | INFO |
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.self_attn.v_proj.weight 2023-11-02 18:06:49.675 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.self_attn.o_proj.weight 2023-11-02 18:06:49.675 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.mlp.gate_proj.weight 2023-11-02 18:06:49.675 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.mlp.up_proj.weight 2023-11-02 18:06:49.675 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.mlp.down_proj.weight 2023-11-02 18:06:49.675 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.input_layernorm.weight 2023-11-02 18:06:49.676 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.1.post_attention_layernorm.weight 2023-11-02 18:06:49.676 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.self_attn.q_proj.weight 2023-11-02 18:06:49.676 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.self_attn.k_proj.weight 2023-11-02 18:06:49.676 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.self_attn.v_proj.weight 2023-11-02 18:06:49.676 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.self_attn.o_proj.weight 2023-11-02 18:06:49.677 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.mlp.gate_proj.weight 2023-11-02 18:06:49.677 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.mlp.up_proj.weight 2023-11-02 18:06:49.677 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.mlp.down_proj.weight 2023-11-02 18:06:49.677 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.input_layernorm.weight 2023-11-02 18:06:49.677 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.2.post_attention_layernorm.weight 2023-11-02 18:06:49.677 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.self_attn.q_proj.weight 2023-11-02 18:06:49.678 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.self_attn.k_proj.weight 2023-11-02 18:06:49.678 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.self_attn.v_proj.weight 2023-11-02 18:06:49.678 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.self_attn.o_proj.weight 2023-11-02 18:06:49.678 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.mlp.gate_proj.weight 2023-11-02 18:06:49.678 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.mlp.up_proj.weight 2023-11-02 18:06:49.679 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.mlp.down_proj.weight 2023-11-02 18:06:49.679 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.input_layernorm.weight 2023-11-02 18:06:49.679 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.3.post_attention_layernorm.weight 2023-11-02 18:06:49.679 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.self_attn.q_proj.weight 2023-11-02 18:06:49.679 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.self_attn.k_proj.weight 2023-11-02 18:06:49.680 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.self_attn.v_proj.weight 2023-11-02 18:06:49.680 | 
INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.self_attn.o_proj.weight 2023-11-02 18:06:49.680 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.mlp.gate_proj.weight 2023-11-02 18:06:49.680 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.mlp.up_proj.weight 2023-11-02 18:06:49.680 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.mlp.down_proj.weight 2023-11-02 18:06:49.680 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.input_layernorm.weight 2023-11-02 18:06:49.681 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.4.post_attention_layernorm.weight 2023-11-02 18:06:49.681 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.self_attn.q_proj.weight 2023-11-02 18:06:49.681 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.self_attn.k_proj.weight 2023-11-02 18:06:49.681 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.self_attn.v_proj.weight 2023-11-02 18:06:49.681 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.self_attn.o_proj.weight 2023-11-02 18:06:49.682 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.mlp.gate_proj.weight 2023-11-02 18:06:49.682 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.mlp.up_proj.weight 2023-11-02 18:06:49.682 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.mlp.down_proj.weight 2023-11-02 18:06:49.682 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.input_layernorm.weight 2023-11-02 18:06:49.682 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.5.post_attention_layernorm.weight 2023-11-02 18:06:49.683 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.self_attn.q_proj.weight 2023-11-02 18:06:49.683 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.self_attn.k_proj.weight 2023-11-02 18:06:49.683 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.self_attn.v_proj.weight 2023-11-02 18:06:49.683 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.self_attn.o_proj.weight 2023-11-02 18:06:49.683 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.mlp.gate_proj.weight 2023-11-02 18:06:49.683 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.mlp.up_proj.weight 2023-11-02 18:06:49.684 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.mlp.down_proj.weight 2023-11-02 18:06:49.684 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.input_layernorm.weight 2023-11-02 18:06:49.684 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.6.post_attention_layernorm.weight 2023-11-02 18:06:49.684 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.self_attn.q_proj.weight 2023-11-02 18:06:49.684 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.self_attn.k_proj.weight 2023-11-02 18:06:49.685 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.self_attn.v_proj.weight 2023-11-02 18:06:49.685 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.self_attn.o_proj.weight 2023-11-02 18:06:49.685 | 
INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.mlp.gate_proj.weight 2023-11-02 18:06:49.685 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.mlp.up_proj.weight 2023-11-02 18:06:49.685 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.mlp.down_proj.weight 2023-11-02 18:06:49.685 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.input_layernorm.weight 2023-11-02 18:06:49.686 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.7.post_attention_layernorm.weight 2023-11-02 18:06:49.686 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.self_attn.q_proj.weight 2023-11-02 18:06:49.686 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.self_attn.k_proj.weight 2023-11-02 18:06:49.686 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.self_attn.v_proj.weight 2023-11-02 18:06:49.686 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.self_attn.o_proj.weight 2023-11-02 18:06:49.687 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.mlp.gate_proj.weight 2023-11-02 18:06:49.687 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.mlp.up_proj.weight 2023-11-02 18:06:49.687 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.mlp.down_proj.weight 2023-11-02 18:06:49.687 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.input_layernorm.weight 2023-11-02 18:06:49.687 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.8.post_attention_layernorm.weight 2023-11-02 18:06:49.687 | INFO 
| mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.self_attn.q_proj.weight 2023-11-02 18:06:49.688 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.self_attn.k_proj.weight 2023-11-02 18:06:49.688 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.self_attn.v_proj.weight 2023-11-02 18:06:49.688 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.self_attn.o_proj.weight 2023-11-02 18:06:49.688 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.mlp.gate_proj.weight 2023-11-02 18:06:49.688 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.mlp.up_proj.weight 2023-11-02 18:06:49.689 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.mlp.down_proj.weight 2023-11-02 18:06:49.689 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.input_layernorm.weight 2023-11-02 18:06:49.689 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.9.post_attention_layernorm.weight 2023-11-02 18:06:49.689 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.self_attn.q_proj.weight 2023-11-02 18:06:49.689 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.self_attn.k_proj.weight 2023-11-02 18:06:49.689 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.self_attn.v_proj.weight 2023-11-02 18:06:49.690 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.self_attn.o_proj.weight 2023-11-02 18:06:49.690 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.mlp.gate_proj.weight 2023-11-02 18:06:49.690 | INFO 
| mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.mlp.up_proj.weight 2023-11-02 18:06:49.690 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.mlp.down_proj.weight 2023-11-02 18:06:49.690 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.input_layernorm.weight 2023-11-02 18:06:49.691 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.10.post_attention_layernorm.weight 2023-11-02 18:06:49.691 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.self_attn.q_proj.weight 2023-11-02 18:06:49.691 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.self_attn.k_proj.weight 2023-11-02 18:06:49.691 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.self_attn.v_proj.weight 2023-11-02 18:06:49.691 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.self_attn.o_proj.weight 2023-11-02 18:06:49.692 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.mlp.gate_proj.weight 2023-11-02 18:06:49.692 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.mlp.up_proj.weight 2023-11-02 18:06:49.692 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.mlp.down_proj.weight 2023-11-02 18:06:49.692 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.input_layernorm.weight 2023-11-02 18:06:49.692 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.11.post_attention_layernorm.weight 2023-11-02 18:06:49.692 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.self_attn.q_proj.weight 2023-11-02 
18:06:49.693 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.self_attn.k_proj.weight 2023-11-02 18:06:49.693 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.self_attn.v_proj.weight 2023-11-02 18:06:49.693 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.self_attn.o_proj.weight 2023-11-02 18:06:49.693 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.mlp.gate_proj.weight 2023-11-02 18:06:49.693 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.mlp.up_proj.weight 2023-11-02 18:06:49.694 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.mlp.down_proj.weight 2023-11-02 18:06:49.694 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.input_layernorm.weight 2023-11-02 18:06:49.694 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.12.post_attention_layernorm.weight 2023-11-02 18:06:49.694 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.self_attn.q_proj.weight 2023-11-02 18:06:49.694 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.self_attn.k_proj.weight 2023-11-02 18:06:49.694 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.self_attn.v_proj.weight 2023-11-02 18:06:49.695 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.self_attn.o_proj.weight 2023-11-02 18:06:49.695 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.mlp.gate_proj.weight 2023-11-02 18:06:49.695 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.mlp.up_proj.weight 
2023-11-02 18:06:49.695 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.mlp.down_proj.weight 2023-11-02 18:06:49.695 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.input_layernorm.weight 2023-11-02 18:06:49.696 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.13.post_attention_layernorm.weight 2023-11-02 18:06:49.696 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.self_attn.q_proj.weight 2023-11-02 18:06:49.696 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.self_attn.k_proj.weight 2023-11-02 18:06:49.696 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.self_attn.v_proj.weight 2023-11-02 18:06:49.696 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.self_attn.o_proj.weight 2023-11-02 18:06:49.697 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.mlp.gate_proj.weight 2023-11-02 18:06:49.697 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.mlp.up_proj.weight 2023-11-02 18:06:49.697 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.mlp.down_proj.weight 2023-11-02 18:06:49.697 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.input_layernorm.weight 2023-11-02 18:06:49.697 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.14.post_attention_layernorm.weight 2023-11-02 18:06:49.697 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.self_attn.q_proj.weight 2023-11-02 18:06:49.698 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.layers.15.self_attn.k_proj.weight 2023-11-02 18:06:49.698 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.self_attn.v_proj.weight 2023-11-02 18:06:49.698 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.self_attn.o_proj.weight 2023-11-02 18:06:49.698 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.mlp.gate_proj.weight 2023-11-02 18:06:49.698 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.mlp.up_proj.weight 2023-11-02 18:06:49.699 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.mlp.down_proj.weight 2023-11-02 18:06:49.699 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.input_layernorm.weight 2023-11-02 18:06:49.699 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.15.post_attention_layernorm.weight 2023-11-02 18:06:49.699 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.self_attn.q_proj.weight 2023-11-02 18:06:49.699 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.self_attn.k_proj.weight 2023-11-02 18:06:49.700 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.self_attn.v_proj.weight 2023-11-02 18:06:49.700 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.self_attn.o_proj.weight 2023-11-02 18:06:49.700 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.mlp.gate_proj.weight 2023-11-02 18:06:49.700 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.mlp.up_proj.weight 2023-11-02 18:06:49.700 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable 
Parameters: model.layers.16.mlp.down_proj.weight 2023-11-02 18:06:49.700 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.input_layernorm.weight 2023-11-02 18:06:49.701 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.16.post_attention_layernorm.weight 2023-11-02 18:06:49.701 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.self_attn.q_proj.weight 2023-11-02 18:06:49.701 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.self_attn.k_proj.weight 2023-11-02 18:06:49.701 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.self_attn.v_proj.weight 2023-11-02 18:06:49.701 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.self_attn.o_proj.weight 2023-11-02 18:06:49.702 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.mlp.gate_proj.weight 2023-11-02 18:06:49.702 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.mlp.up_proj.weight 2023-11-02 18:06:49.702 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.mlp.down_proj.weight 2023-11-02 18:06:49.702 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.input_layernorm.weight 2023-11-02 18:06:49.702 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.17.post_attention_layernorm.weight 2023-11-02 18:06:49.703 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.self_attn.q_proj.weight 2023-11-02 18:06:49.703 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.self_attn.k_proj.weight 2023-11-02 18:06:49.703 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.self_attn.v_proj.weight 2023-11-02 18:06:49.703 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.self_attn.o_proj.weight 2023-11-02 18:06:49.703 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.mlp.gate_proj.weight 2023-11-02 18:06:49.703 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.mlp.up_proj.weight 2023-11-02 18:06:49.704 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.mlp.down_proj.weight 2023-11-02 18:06:49.704 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.input_layernorm.weight 2023-11-02 18:06:49.704 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.18.post_attention_layernorm.weight 2023-11-02 18:06:49.704 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.self_attn.q_proj.weight 2023-11-02 18:06:49.704 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.self_attn.k_proj.weight 2023-11-02 18:06:49.705 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.self_attn.v_proj.weight 2023-11-02 18:06:49.705 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.self_attn.o_proj.weight 2023-11-02 18:06:49.705 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.mlp.gate_proj.weight 2023-11-02 18:06:49.705 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.mlp.up_proj.weight 2023-11-02 18:06:49.705 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.mlp.down_proj.weight 2023-11-02 18:06:49.705 | INFO 
| mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.input_layernorm.weight 2023-11-02 18:06:49.706 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.19.post_attention_layernorm.weight 2023-11-02 18:06:49.706 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.self_attn.q_proj.weight 2023-11-02 18:06:49.706 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.self_attn.k_proj.weight 2023-11-02 18:06:49.706 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.self_attn.v_proj.weight 2023-11-02 18:06:49.706 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.self_attn.o_proj.weight 2023-11-02 18:06:49.707 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.mlp.gate_proj.weight 2023-11-02 18:06:49.707 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.mlp.up_proj.weight 2023-11-02 18:06:49.707 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.mlp.down_proj.weight 2023-11-02 18:06:49.707 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.input_layernorm.weight 2023-11-02 18:06:49.707 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.20.post_attention_layernorm.weight 2023-11-02 18:06:49.707 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.self_attn.q_proj.weight 2023-11-02 18:06:49.708 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.self_attn.k_proj.weight 2023-11-02 18:06:49.708 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.self_attn.v_proj.weight 2023-11-02 
18:06:49.708 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.self_attn.o_proj.weight 2023-11-02 18:06:49.708 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.mlp.gate_proj.weight 2023-11-02 18:06:49.708 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.mlp.up_proj.weight 2023-11-02 18:06:49.709 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.mlp.down_proj.weight 2023-11-02 18:06:49.709 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.input_layernorm.weight 2023-11-02 18:06:49.709 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.21.post_attention_layernorm.weight 2023-11-02 18:06:49.709 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.self_attn.q_proj.weight 2023-11-02 18:06:49.709 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.self_attn.k_proj.weight 2023-11-02 18:06:49.709 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.self_attn.v_proj.weight 2023-11-02 18:06:49.710 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.self_attn.o_proj.weight 2023-11-02 18:06:49.710 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.mlp.gate_proj.weight 2023-11-02 18:06:49.710 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.mlp.up_proj.weight 2023-11-02 18:06:49.710 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.mlp.down_proj.weight 2023-11-02 18:06:49.710 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.input_layernorm.weight 2023-11-02 
18:06:49.711 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.22.post_attention_layernorm.weight 2023-11-02 18:06:49.711 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.self_attn.q_proj.weight 2023-11-02 18:06:49.711 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.self_attn.k_proj.weight 2023-11-02 18:06:49.711 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.self_attn.v_proj.weight 2023-11-02 18:06:49.711 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.self_attn.o_proj.weight 2023-11-02 18:06:49.711 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.mlp.gate_proj.weight 2023-11-02 18:06:49.712 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.mlp.up_proj.weight 2023-11-02 18:06:49.712 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.mlp.down_proj.weight 2023-11-02 18:06:49.712 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.input_layernorm.weight 2023-11-02 18:06:49.712 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.23.post_attention_layernorm.weight 2023-11-02 18:06:49.712 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.self_attn.q_proj.weight 2023-11-02 18:06:49.713 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.self_attn.k_proj.weight 2023-11-02 18:06:49.713 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.self_attn.v_proj.weight 2023-11-02 18:06:49.713 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.layers.24.self_attn.o_proj.weight 2023-11-02 18:06:49.713 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.mlp.gate_proj.weight 2023-11-02 18:06:49.713 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.mlp.up_proj.weight 2023-11-02 18:06:49.713 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.mlp.down_proj.weight 2023-11-02 18:06:49.714 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.input_layernorm.weight 2023-11-02 18:06:49.714 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.24.post_attention_layernorm.weight 2023-11-02 18:06:49.714 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.self_attn.q_proj.weight 2023-11-02 18:06:49.714 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.self_attn.k_proj.weight 2023-11-02 18:06:49.714 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.self_attn.v_proj.weight 2023-11-02 18:06:49.715 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.self_attn.o_proj.weight 2023-11-02 18:06:49.715 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.mlp.gate_proj.weight 2023-11-02 18:06:49.715 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.mlp.up_proj.weight 2023-11-02 18:06:49.715 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.mlp.down_proj.weight 2023-11-02 18:06:49.715 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.25.input_layernorm.weight 2023-11-02 18:06:49.716 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.layers.25.post_attention_layernorm.weight 2023-11-02 18:06:49.716 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.self_attn.q_proj.weight 2023-11-02 18:06:49.716 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.self_attn.k_proj.weight 2023-11-02 18:06:49.716 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.self_attn.v_proj.weight 2023-11-02 18:06:49.716 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.self_attn.o_proj.weight 2023-11-02 18:06:49.716 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.mlp.gate_proj.weight 2023-11-02 18:06:49.717 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.mlp.up_proj.weight 2023-11-02 18:06:49.717 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.mlp.down_proj.weight 2023-11-02 18:06:49.717 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.input_layernorm.weight 2023-11-02 18:06:49.717 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.26.post_attention_layernorm.weight 2023-11-02 18:06:49.717 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.self_attn.q_proj.weight 2023-11-02 18:06:49.718 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.self_attn.k_proj.weight 2023-11-02 18:06:49.718 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.self_attn.v_proj.weight 2023-11-02 18:06:49.718 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.self_attn.o_proj.weight 2023-11-02 18:06:49.718 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> 
Trainable Parameters: model.layers.27.mlp.gate_proj.weight 2023-11-02 18:06:49.718 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.mlp.up_proj.weight 2023-11-02 18:06:49.718 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.mlp.down_proj.weight 2023-11-02 18:06:49.719 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.input_layernorm.weight 2023-11-02 18:06:49.719 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.27.post_attention_layernorm.weight 2023-11-02 18:06:49.719 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.self_attn.q_proj.weight 2023-11-02 18:06:49.719 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.self_attn.k_proj.weight 2023-11-02 18:06:49.719 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.self_attn.v_proj.weight 2023-11-02 18:06:49.720 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.self_attn.o_proj.weight 2023-11-02 18:06:49.720 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.mlp.gate_proj.weight 2023-11-02 18:06:49.720 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.mlp.up_proj.weight 2023-11-02 18:06:49.720 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.mlp.down_proj.weight 2023-11-02 18:06:49.720 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.input_layernorm.weight 2023-11-02 18:06:49.720 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.28.post_attention_layernorm.weight 2023-11-02 18:06:49.721 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.self_attn.q_proj.weight 2023-11-02 18:06:49.721 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.self_attn.k_proj.weight 2023-11-02 18:06:49.721 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.self_attn.v_proj.weight 2023-11-02 18:06:49.721 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.self_attn.o_proj.weight 2023-11-02 18:06:49.721 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.mlp.gate_proj.weight 2023-11-02 18:06:49.721 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.mlp.up_proj.weight 2023-11-02 18:06:49.722 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.mlp.down_proj.weight 2023-11-02 18:06:49.722 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.input_layernorm.weight 2023-11-02 18:06:49.722 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.29.post_attention_layernorm.weight 2023-11-02 18:06:49.722 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.self_attn.q_proj.weight 2023-11-02 18:06:49.722 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.self_attn.k_proj.weight 2023-11-02 18:06:49.723 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.self_attn.v_proj.weight 2023-11-02 18:06:49.723 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.self_attn.o_proj.weight 2023-11-02 18:06:49.723 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.mlp.gate_proj.weight 2023-11-02 18:06:49.723 
| INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.mlp.up_proj.weight 2023-11-02 18:06:49.723 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.mlp.down_proj.weight 2023-11-02 18:06:49.723 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.input_layernorm.weight 2023-11-02 18:06:49.724 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.30.post_attention_layernorm.weight 2023-11-02 18:06:49.724 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.self_attn.q_proj.weight 2023-11-02 18:06:49.724 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.self_attn.k_proj.weight 2023-11-02 18:06:49.724 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.self_attn.v_proj.weight 2023-11-02 18:06:49.724 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.self_attn.o_proj.weight 2023-11-02 18:06:49.725 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.mlp.gate_proj.weight 2023-11-02 18:06:49.725 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.mlp.up_proj.weight 2023-11-02 18:06:49.725 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.mlp.down_proj.weight 2023-11-02 18:06:49.725 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.input_layernorm.weight 2023-11-02 18:06:49.725 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.layers.31.post_attention_layernorm.weight 2023-11-02 18:06:49.725 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.norm.weight 2023-11-02 18:06:49.726 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.embeddings.class_embedding 2023-11-02 18:06:49.726 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight 2023-11-02 18:06:49.726 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight 2023-11-02 18:06:49.726 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight 2023-11-02 18:06:49.726 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias 2023-11-02 18:06:49.727 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight 2023-11-02 18:06:49.727 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias 2023-11-02 18:06:49.727 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight 2023-11-02 18:06:49.727 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias 2023-11-02 18:06:49.727 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight 2023-11-02 18:06:49.728 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias 2023-11-02 
18:06:49.728 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight 2023-11-02 18:06:49.728 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias 2023-11-02 18:06:49.728 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight 2023-11-02 18:06:49.728 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias 2023-11-02 18:06:49.728 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight 2023-11-02 18:06:49.729 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias 2023-11-02 18:06:49.729 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight 2023-11-02 18:06:49.729 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias 2023-11-02 18:06:49.729 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight 2023-11-02 18:06:49.729 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias 2023-11-02 18:06:49.730 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight 2023-11-02 18:06:49.730 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias 2023-11-02 18:06:49.730 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight 2023-11-02 18:06:49.730 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias 2023-11-02 18:06:49.730 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight 2023-11-02 18:06:49.730 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias 2023-11-02 18:06:49.731 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight 2023-11-02 18:06:49.731 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias 2023-11-02 18:06:49.731 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight 2023-11-02 18:06:49.731 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias 2023-11-02 18:06:49.731 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight 2023-11-02 18:06:49.731 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias 2023-11-02 18:06:49.732 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight 2023-11-02 18:06:49.732 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias 2023-11-02 18:06:49.732 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight 2023-11-02 18:06:49.732 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias 2023-11-02 18:06:49.732 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight 2023-11-02 18:06:49.733 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias 2023-11-02 18:06:49.733 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight 2023-11-02 18:06:49.733 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias 2023-11-02 18:06:49.733 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight 2023-11-02 18:06:49.733 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias 2023-11-02 18:06:49.733 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight 2023-11-02 18:06:49.734 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias 2023-11-02 18:06:49.734 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight 2023-11-02 18:06:49.734 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias 2023-11-02 18:06:49.734 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight 2023-11-02 18:06:49.734 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias 2023-11-02 18:06:49.735 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight 2023-11-02 18:06:49.735 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias 2023-11-02 18:06:49.735 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight 2023-11-02 18:06:49.735 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias 2023-11-02 18:06:49.735 | INFO | mmgpt.utils.logger:log_model_parameters:194 
- -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight 2023-11-02 18:06:49.735 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias 2023-11-02 18:06:49.736 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight 2023-11-02 18:06:49.736 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias 2023-11-02 18:06:49.736 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight 2023-11-02 18:06:49.736 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias 2023-11-02 18:06:49.736 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight 2023-11-02 18:06:49.736 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias 2023-11-02 18:06:49.737 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight 2023-11-02 18:06:49.737 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias 2023-11-02 18:06:49.737 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight 2023-11-02 
18:06:49.737 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias 2023-11-02 18:06:49.737 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight 2023-11-02 18:06:49.738 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias 2023-11-02 18:06:49.738 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight 2023-11-02 18:06:49.738 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias 2023-11-02 18:06:49.738 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight 2023-11-02 18:06:49.738 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias 2023-11-02 18:06:49.738 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight 2023-11-02 18:06:49.739 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias 2023-11-02 18:06:49.739 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight 2023-11-02 18:06:49.739 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias 2023-11-02 18:06:49.739 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight 2023-11-02 18:06:49.739 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias 2023-11-02 18:06:49.740 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight 2023-11-02 18:06:49.740 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias 2023-11-02 18:06:49.740 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight 2023-11-02 18:06:49.740 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias 2023-11-02 18:06:49.740 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight 2023-11-02 18:06:49.740 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias 2023-11-02 18:06:49.741 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight 2023-11-02 18:06:49.741 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias 2023-11-02 18:06:49.741 | INFO | mmgpt.utils.logger:log_model_parameters:194 
- -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight 2023-11-02 18:06:49.741 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias 2023-11-02 18:06:49.741 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight 2023-11-02 18:06:49.741 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias 2023-11-02 18:06:49.742 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight 2023-11-02 18:06:49.742 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias 2023-11-02 18:06:49.742 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight 2023-11-02 18:06:49.742 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias 2023-11-02 18:06:49.742 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight 2023-11-02 18:06:49.743 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias 2023-11-02 18:06:49.743 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight 2023-11-02 
18:06:49.743 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias 2023-11-02 18:06:49.743 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight 2023-11-02 18:06:49.743 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias 2023-11-02 18:06:49.743 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight 2023-11-02 18:06:49.744 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias 2023-11-02 18:06:49.744 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight 2023-11-02 18:06:49.744 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias 2023-11-02 18:06:49.744 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight 2023-11-02 18:06:49.744 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias 2023-11-02 18:06:49.745 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight 2023-11-02 18:06:49.745 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias 2023-11-02 18:06:49.745 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight 2023-11-02 18:06:49.745 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias 2023-11-02 18:06:49.745 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight 2023-11-02 18:06:49.745 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias 2023-11-02 18:06:49.746 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight 2023-11-02 18:06:49.746 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias 2023-11-02 18:06:49.746 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight 2023-11-02 18:06:49.746 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias 2023-11-02 18:06:49.746 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight 2023-11-02 18:06:49.747 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias 2023-11-02 18:06:49.747 | INFO | mmgpt.utils.logger:log_model_parameters:194 
- -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight 2023-11-02 18:06:49.747 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias 2023-11-02 18:06:49.747 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight 2023-11-02 18:06:49.747 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias 2023-11-02 18:06:49.747 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight 2023-11-02 18:06:49.748 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias 2023-11-02 18:06:49.748 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight 2023-11-02 18:06:49.748 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias 2023-11-02 18:06:49.748 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight 2023-11-02 18:06:49.748 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias 2023-11-02 18:06:49.748 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight 2023-11-02 
18:06:49.749 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias 2023-11-02 18:06:49.749 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight 2023-11-02 18:06:49.749 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias 2023-11-02 18:06:49.749 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight 2023-11-02 18:06:49.749 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias 2023-11-02 18:06:49.750 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight 2023-11-02 18:06:49.750 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias 2023-11-02 18:06:49.750 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight 2023-11-02 18:06:49.750 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias 2023-11-02 18:06:49.750 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight 2023-11-02 18:06:49.750 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias 2023-11-02 18:06:49.751 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight 2023-11-02 18:06:49.751 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias 2023-11-02 18:06:49.751 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight 2023-11-02 18:06:49.751 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias 2023-11-02 18:06:49.751 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight 2023-11-02 18:06:49.752 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias 2023-11-02 18:06:49.752 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight 2023-11-02 18:06:49.752 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias 2023-11-02 18:06:49.752 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight 2023-11-02 18:06:49.752 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias 2023-11-02 18:06:49.752 | INFO | mmgpt.utils.logger:log_model_parameters:194 
- -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight 2023-11-02 18:06:49.753 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias 2023-11-02 18:06:49.753 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight 2023-11-02 18:06:49.753 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias 2023-11-02 18:06:49.753 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight 2023-11-02 18:06:49.753 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias 2023-11-02 18:06:49.753 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight 2023-11-02 18:06:49.754 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias 2023-11-02 18:06:49.754 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight 2023-11-02 18:06:49.754 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias 2023-11-02 18:06:49.754 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight 2023-11-02 
18:06:49.754 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias 2023-11-02 18:06:49.755 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight 2023-11-02 18:06:49.755 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias 2023-11-02 18:06:49.755 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight 2023-11-02 18:06:49.755 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias 2023-11-02 18:06:49.755 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight 2023-11-02 18:06:49.755 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias 2023-11-02 18:06:49.756 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight 2023-11-02 18:06:49.756 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias 2023-11-02 18:06:49.756 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight 2023-11-02 18:06:49.756 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias 2023-11-02 18:06:49.756 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight 2023-11-02 18:06:49.757 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias 2023-11-02 18:06:49.757 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight 2023-11-02 18:06:49.757 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias 2023-11-02 18:06:49.757 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight 2023-11-02 18:06:49.757 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias 2023-11-02 18:06:49.757 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight 2023-11-02 18:06:49.758 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias 2023-11-02 18:06:49.758 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight 2023-11-02 18:06:49.758 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias 2023-11-02 18:06:49.758 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight 2023-11-02 18:06:49.758 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias 2023-11-02 18:06:49.759 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight 2023-11-02 18:06:49.759 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias 2023-11-02 18:06:49.759 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight 2023-11-02 18:06:49.759 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias 2023-11-02 18:06:49.759 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight 2023-11-02 18:06:49.759 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias 2023-11-02 18:06:49.760 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight 2023-11-02 18:06:49.760 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias 2023-11-02 18:06:49.760 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight 2023-11-02 18:06:49.760 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias 2023-11-02 18:06:49.760 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight 2023-11-02 18:06:49.760 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias 2023-11-02 18:06:49.761 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight 2023-11-02 18:06:49.761 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias 2023-11-02 18:06:49.761 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight 2023-11-02 18:06:49.761 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias 2023-11-02 18:06:49.761 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight 2023-11-02 18:06:49.762 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias 2023-11-02 18:06:49.762 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight 2023-11-02 18:06:49.762 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias 2023-11-02 18:06:49.762 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight 2023-11-02 18:06:49.762 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias 2023-11-02 18:06:49.762 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight 2023-11-02 18:06:49.763 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias 2023-11-02 18:06:49.763 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight 2023-11-02 18:06:49.763 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias 2023-11-02 18:06:49.763 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight 2023-11-02 18:06:49.763 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias 2023-11-02 18:06:49.763 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight 2023-11-02 18:06:49.764 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias 2023-11-02 18:06:49.764 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight 2023-11-02 18:06:49.764 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias 2023-11-02 18:06:49.764 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight 2023-11-02 18:06:49.764 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias 2023-11-02 18:06:49.765 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight 2023-11-02 18:06:49.765 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias 2023-11-02 18:06:49.765 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight 2023-11-02 18:06:49.765 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias 2023-11-02 18:06:49.765 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight 2023-11-02 18:06:49.766 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias 2023-11-02 18:06:49.766 | 
INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight 2023-11-02 18:06:49.766 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias 2023-11-02 18:06:49.766 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight 2023-11-02 18:06:49.766 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias 2023-11-02 18:06:49.766 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight 2023-11-02 18:06:49.767 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias 2023-11-02 18:06:49.767 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight 2023-11-02 18:06:49.767 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias 2023-11-02 18:06:49.767 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight 2023-11-02 18:06:49.767 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias 2023-11-02 18:06:49.768 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight 2023-11-02 18:06:49.768 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias 2023-11-02 18:06:49.768 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight 2023-11-02 18:06:49.768 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias 2023-11-02 18:06:49.768 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight 2023-11-02 18:06:49.768 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias 2023-11-02 18:06:49.769 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight 2023-11-02 18:06:49.769 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias 2023-11-02 18:06:49.769 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight 2023-11-02 18:06:49.769 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias 2023-11-02 18:06:49.769 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight 2023-11-02 18:06:49.770 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias 2023-11-02 18:06:49.770 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight 2023-11-02 18:06:49.770 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias 2023-11-02 18:06:49.770 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight 2023-11-02 18:06:49.770 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias 2023-11-02 18:06:49.771 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight 2023-11-02 18:06:49.771 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias 2023-11-02 18:06:49.771 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight 2023-11-02 18:06:49.771 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias 2023-11-02 18:06:49.771 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight 2023-11-02 18:06:49.771 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias 2023-11-02 18:06:49.772 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight 2023-11-02 18:06:49.772 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias 2023-11-02 18:06:49.772 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight 2023-11-02 18:06:49.772 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias 2023-11-02 18:06:49.772 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight 2023-11-02 18:06:49.773 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias 2023-11-02 18:06:49.773 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight 2023-11-02 18:06:49.773 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias 2023-11-02 18:06:49.773 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight 2023-11-02 18:06:49.773 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias 2023-11-02 18:06:49.773 | INFO | 
mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight 2023-11-02 18:06:49.774 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias 2023-11-02 18:06:49.774 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight 2023-11-02 18:06:49.774 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias 2023-11-02 18:06:49.774 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight 2023-11-02 18:06:49.774 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias 2023-11-02 18:06:49.775 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight 2023-11-02 18:06:49.775 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias 2023-11-02 18:06:49.775 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight 2023-11-02 18:06:49.775 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias 2023-11-02 18:06:49.775 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: 
model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight
2023-11-02 18:06:49.775 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias
2023-11-02 18:06:49.776 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight
2023-11-02 18:06:49.776 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias
2023-11-02 18:06:49.776 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight
2023-11-02 18:06:49.776 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias
2023-11-02 18:06:49.776 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight
2023-11-02 18:06:49.776 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias
2023-11-02 18:06:49.777 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight
2023-11-02 18:06:49.777 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias
2023-11-02 18:06:49.777 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight
2023-11-02 18:06:49.777 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias
2023-11-02 18:06:49.777 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight
2023-11-02 18:06:49.778 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias
2023-11-02 18:06:49.778 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight
2023-11-02 18:06:49.778 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias
2023-11-02 18:06:49.778 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight
2023-11-02 18:06:49.778 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias
2023-11-02 18:06:49.778 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight
2023-11-02 18:06:49.779 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias
2023-11-02 18:06:49.779 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight
2023-11-02 18:06:49.779 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias
2023-11-02 18:06:49.779 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight
2023-11-02 18:06:49.779 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias
2023-11-02 18:06:49.779 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight
2023-11-02 18:06:49.780 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias
2023-11-02 18:06:49.780 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight
2023-11-02 18:06:49.780 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias
2023-11-02 18:06:49.780 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight
2023-11-02 18:06:49.780 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias
2023-11-02 18:06:49.781 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight
2023-11-02 18:06:49.781 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias
2023-11-02 18:06:49.781 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight
2023-11-02 18:06:49.781 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias
2023-11-02 18:06:49.781 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight
2023-11-02 18:06:49.781 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias
2023-11-02 18:06:49.782 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight
2023-11-02 18:06:49.782 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias
2023-11-02 18:06:49.782 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight
2023-11-02 18:06:49.782 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias
2023-11-02 18:06:49.782 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight
2023-11-02 18:06:49.782 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias
2023-11-02 18:06:49.783 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight
2023-11-02 18:06:49.783 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias
2023-11-02 18:06:49.783 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight
2023-11-02 18:06:49.783 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias
2023-11-02 18:06:49.783 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight
2023-11-02 18:06:49.784 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias
2023-11-02 18:06:49.784 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight
2023-11-02 18:06:49.784 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias
2023-11-02 18:06:49.784 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight
2023-11-02 18:06:49.784 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias
2023-11-02 18:06:49.784 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight
2023-11-02 18:06:49.785 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias
2023-11-02 18:06:49.785 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight
2023-11-02 18:06:49.785 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias
2023-11-02 18:06:49.785 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight
2023-11-02 18:06:49.785 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias
2023-11-02 18:06:49.786 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight
2023-11-02 18:06:49.786 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias
2023-11-02 18:06:49.786 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight
2023-11-02 18:06:49.786 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias
2023-11-02 18:06:49.786 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight
2023-11-02 18:06:49.786 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias
2023-11-02 18:06:49.787 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight
2023-11-02 18:06:49.787 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias
2023-11-02 18:06:49.787 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight
2023-11-02 18:06:49.787 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias
2023-11-02 18:06:49.787 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight
2023-11-02 18:06:49.787 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias
2023-11-02 18:06:49.788 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight
2023-11-02 18:06:49.788 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias
2023-11-02 18:06:49.788 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight
2023-11-02 18:06:49.788 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias
2023-11-02 18:06:49.788 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight
2023-11-02 18:06:49.789 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias
2023-11-02 18:06:49.789 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight
2023-11-02 18:06:49.789 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias
2023-11-02 18:06:49.789 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight
2023-11-02 18:06:49.789 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias
2023-11-02 18:06:49.789 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight
2023-11-02 18:06:49.790 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias
2023-11-02 18:06:49.790 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight
2023-11-02 18:06:49.790 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias
2023-11-02 18:06:49.790 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight
2023-11-02 18:06:49.790 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias
2023-11-02 18:06:49.790 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight
2023-11-02 18:06:49.791 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias
2023-11-02 18:06:49.791 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight
2023-11-02 18:06:49.791 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias
2023-11-02 18:06:49.791 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight
2023-11-02 18:06:49.791 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias
2023-11-02 18:06:49.792 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight
2023-11-02 18:06:49.792 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias
2023-11-02 18:06:49.792 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight
2023-11-02 18:06:49.792 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias
2023-11-02 18:06:49.792 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight
2023-11-02 18:06:49.792 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias
2023-11-02 18:06:49.793 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight
2023-11-02 18:06:49.793 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias
2023-11-02 18:06:49.793 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight
2023-11-02 18:06:49.793 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias
2023-11-02 18:06:49.793 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight
2023-11-02 18:06:49.793 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias
2023-11-02 18:06:49.794 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight
2023-11-02 18:06:49.794 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias
2023-11-02 18:06:49.794 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight
2023-11-02 18:06:49.794 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias
2023-11-02 18:06:49.794 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight
2023-11-02 18:06:49.795 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias
2023-11-02 18:06:49.795 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight
2023-11-02 18:06:49.795 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias
2023-11-02 18:06:49.795 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight
2023-11-02 18:06:49.795 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias
2023-11-02 18:06:49.795 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.post_layernorm.weight
2023-11-02 18:06:49.796 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.vision_tower.vision_tower.vision_model.post_layernorm.bias
2023-11-02 18:06:49.796 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.projector.projector.weight
2023-11-02 18:06:49.796 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: model.projector.projector.bias
2023-11-02 18:06:49.796 | INFO | mmgpt.utils.logger:log_model_parameters:194 - -> Trainable Parameters: lm_head.weight
2023-11-02 18:06:49.800 | INFO | mmgpt.utils.logger:log_model_parameters:199 - >> Total params: 6752.17M
2023-11-02 18:06:49.800 | INFO | mmgpt.utils.logger:log_model_parameters:200 - >> Train params: 6752.17M, Ratio 100.00%
2023-11-02 18:06:49.817 | INFO | mmgpt.data.dataset.pair_webdataset:__init__:53 - 1666666 interleaved (6-merged) image-text pairs (splitted to 32 workers) are sampled from dataset: laion2b_10m_6merge.
2023-11-02 18:06:50.060 | INFO | mmgpt.data.dataset.pair_webdataset:__init__:53 - 833333 interleaved (6-merged) image-text pairs (splitted to 32 workers) are sampled from dataset: grit_5m_6merge.
2023-11-02 18:06:50.070 | INFO | mmgpt.data.dataset.interpair_webdataset:__init__:51 - 500000 interleaved (2-merged) image-text pairs (splitted to 32 workers) are sampled from dataset: track_1m_v1_2merge.
2023-11-02 18:06:50.083 | INFO | mmgpt.data.dataset.interpair_webdataset:__init__:51 - 1250000 interleaved (4-merged) image-text pairs (splitted to 32 workers) are sampled from dataset: det_5m_v1_en_4merge.
2023-11-02 18:06:50.084 | INFO | mmgpt.data.builder:build_dataloader:65 - After processing, totally 4249999 samples are involved.
2023-11-02 18:06:50.255 | INFO | mmgpt.engine.train.trainer:create_optimizer:62 - ->> Number of Optimizer Groups: 50
2023-11-02 18:06:50.256 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 0: 233 groups of parameters maintains a learning rate of 5e-05
2023-11-02 18:06:50.256 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 1: 2 groups of parameters maintains a learning rate of 5e-06
2023-11-02 18:06:50.256 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 2: 6 groups of parameters maintains a learning rate of 4.923854510918059e-06
2023-11-02 18:06:50.256 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 3: 6 groups of parameters maintains a learning rate of 5.470949456575621e-06
2023-11-02 18:06:50.257 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 4: 6 groups of parameters maintains a learning rate of 6.078832729528468e-06
2023-11-02 18:06:50.257 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 5: 6 groups of parameters maintains a learning rate of 6.7542585883649645e-06
2023-11-02 18:06:50.257 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 6: 6 groups of parameters maintains a learning rate of 7.504731764849959e-06
2023-11-02 18:06:50.257 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 7: 6 groups of parameters maintains a learning rate of 8.338590849833288e-06
2023-11-02 18:06:50.257 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 8: 6 groups of parameters maintains a learning rate of 9.265100944259208e-06
2023-11-02 18:06:50.257 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 9: 6 groups of parameters maintains a learning rate of 1.0294556604732453e-05
2023-11-02 18:06:50.258 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 10: 6 groups of parameters maintains a learning rate of 1.1438396227480504e-05
2023-11-02 18:06:50.258 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 11: 6 groups of parameters maintains a learning rate of 1.2709329141645005e-05
2023-11-02 18:06:50.258 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 12: 6 groups of parameters maintains a learning rate of 1.4121476824050005e-05
2023-11-02 18:06:50.258 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 13: 6 groups of parameters maintains a learning rate of 1.5690529804500005e-05
2023-11-02 18:06:50.258 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 14: 6 groups of parameters maintains a learning rate of 1.7433922005000004e-05
2023-11-02 18:06:50.259 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 15: 6 groups of parameters maintains a learning rate of 1.9371024450000006e-05
2023-11-02 18:06:50.259 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 16: 6 groups of parameters maintains a learning rate of 2.1523360500000007e-05
2023-11-02 18:06:50.259 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 17: 6 groups of parameters maintains a learning rate of 2.3914845000000007e-05
2023-11-02 18:06:50.259 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 18: 6 groups of parameters maintains a learning rate of 2.6572050000000003e-05
2023-11-02 18:06:50.259 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 19: 6 groups of parameters maintains a learning rate of 2.9524500000000005e-05
2023-11-02 18:06:50.259 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 20: 6 groups of parameters maintains a learning rate of 3.2805e-05
2023-11-02 18:06:50.260 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 21: 6 groups of parameters maintains a learning rate of 3.6450000000000005e-05
2023-11-02 18:06:50.260 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 22: 6 groups of parameters maintains a learning rate of 4.05e-05
2023-11-02 18:06:50.260 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 23: 6 groups of parameters maintains a learning rate of 4.5e-05
2023-11-02 18:06:50.260 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 24: 6 groups of parameters maintains a learning rate of 5.555555555555556e-05
2023-11-02 18:06:50.260 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 25: 76 groups of parameters maintains a learning rate of 5e-05
2023-11-02 18:06:50.261 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 26: 5 groups of parameters maintains a learning rate of 5e-06
2023-11-02 18:06:50.261 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 27: 10 groups of parameters maintains a learning rate of 4.923854510918059e-06
2023-11-02 18:06:50.261 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 28: 10 groups of parameters maintains a learning rate of 5.470949456575621e-06
2023-11-02 18:06:50.261 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 29: 10 groups of parameters maintains a learning rate of 6.078832729528468e-06
2023-11-02 18:06:50.261 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 30: 10 groups of parameters maintains a learning rate of 6.7542585883649645e-06
2023-11-02 18:06:50.261 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 31: 10 groups of parameters maintains a learning rate of 7.504731764849959e-06
2023-11-02 18:06:50.262 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 32: 10 groups of parameters maintains a learning rate of 8.338590849833288e-06
2023-11-02 18:06:50.262 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 33: 10 groups of parameters maintains a learning rate of 9.265100944259208e-06
2023-11-02 18:06:50.262 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 34: 10 groups of parameters maintains a learning rate of 1.0294556604732453e-05
2023-11-02 18:06:50.262 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 35: 10 groups of parameters maintains a learning rate of 1.1438396227480504e-05
2023-11-02 18:06:50.262 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 36: 10 groups of parameters maintains a learning rate of 1.2709329141645005e-05
2023-11-02 18:06:50.262 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 37: 10 groups of parameters maintains a learning rate of 1.4121476824050005e-05
2023-11-02 18:06:50.263 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 38: 10 groups of parameters maintains a learning rate of 1.5690529804500005e-05
2023-11-02 18:06:50.263 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 39: 10 groups of parameters maintains a learning rate of 1.7433922005000004e-05
2023-11-02 18:06:50.263 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 40: 10 groups of parameters maintains a learning rate of 1.9371024450000006e-05
2023-11-02 18:06:50.263 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 41: 10 groups of parameters maintains a learning rate of 2.1523360500000007e-05
2023-11-02 18:06:50.263 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 42: 10 groups of parameters maintains a learning rate of 2.3914845000000007e-05
2023-11-02 18:06:50.264 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 43: 10 groups of parameters maintains a learning rate of 2.6572050000000003e-05
2023-11-02 18:06:50.264 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 44: 10 groups of parameters maintains a learning rate of 2.9524500000000005e-05
2023-11-02 18:06:50.264 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 45: 10 groups of parameters maintains a learning rate of 3.2805e-05
2023-11-02 18:06:50.264 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 46: 10 groups of parameters maintains a learning rate of 3.6450000000000005e-05
2023-11-02 18:06:50.264 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 47: 10 groups of parameters maintains a learning rate of 4.05e-05
2023-11-02 18:06:50.264 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 48: 10 groups of parameters maintains a learning rate of 4.5e-05
2023-11-02 18:06:50.265 | INFO | mmgpt.engine.train.trainer:create_optimizer:64 - *********>> 49: 10 groups of parameters maintains a learning rate of 5.555555555555556e-05
2023-11-02 18:07:08.846 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:114 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:07:08.846 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:115 - ('Given a video clip including frame1,frame2 and frame3, can you tell me what thisframe:1:[404, 245, 658, 622];frame:2:[381, 212, 621, 577];frame:3:[113, 219, 350, 590] is?Craft a concise reply using the image frames and trajectory specifics you have at hand.', 'This is a/an a brown dog walks towards the right and then barks to the other two dogs')
2023-11-02 18:07:22.249 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:114 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:07:22.250 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:115 - ('\nDetect all.The category:[xmin,ymin,xmax,ymax] format should be rigorously followed in your response.', 'shelf:[001, 306, 261, 999].')
2023-11-02 18:07:23.406 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:114 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:07:23.407 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:115 - ('Given a video clip including frame1,frame2,frame3 and frame4,can you tell me what is thisFrame1:[360, 435, 491, 531] and track its trajectory.All trajectories in your reply should conform to the Frame t:[xmin,ymin,xmax,ymax] pattern.', 'This is a snowmobilFrame1:[360, 435, 491, 531];Frame2:[324, 498, 466, 615];Frame3:[314, 497, 478, 613];Frame4:[289, 538, 453, 655].')
2023-11-02 18:07:40.105 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:114 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:07:40.105 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:115 - ('\nDetect all.When composing your answer, be sure to consistently utilize the category:[xmin,ymin,xmax,ymax] structure.', 'wheel:[539, 555, 592, 651],[440, 558, 497, 648];bicycle:[437, 518, 595, 639];bicycle wheel:[438, 545, 495, 654],[536, 551, 593, 657];person:[468, 403, 571, 627];tree:[000, 000, 201, 701],[005, 330, 096, 537],[104, 000, 304, 533],[132, 373, 183, 510],[210, 000, 448, 608],[212, 379, 303, 536],[270, 384, 349, 533],[312, 393, 434, 531],[407, 402, 520, 530],[418, 000, 999, 608],[483, 361, 583, 519],[555, 373, 643, 518],[620, 363, 665, 521],[721, 349, 810, 557],[734, 159, 851, 629],[794, 343, 895, 545],[877, 351, 972, 539],[916, 345, 999, 537].')
2023-11-02 18:07:54.365 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:114 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:07:54.366 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:115 - ('\nDetect all.For your response, please adhere to the specified category:[xmin,ymin,xmax,ymax] format.', 'clothing:[080, 029, 970, 946].')
2023-11-02 18:07:58.699 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:114 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:07:58.700 | INFO | mmgpt.data.dataset.interpair_webdataset:token_processor:115 - ('Given frame1: and frame2:,track carFrame1:[554, 495, 604, 570],carFrame1:[368, 483, 411, 555],carFrame1:[467, 487, 506, 547] in this video clip.Your reply should be in alignment with the classFrame t:[xmin,ymin,xmax,ymax] structure.', 'carFrame1:[554, 495, 604, 570];Frame2:[554, 498, 605, 572],carFrame1:[368, 483, 411, 555];Frame2:[367, 487, 411, 558],carFrame1:[467, 487, 506, 547];Frame2:[470, 487, 505, 546].')
2023-11-02 18:08:52.934 | INFO | mmgpt.data.dataset.pair_webdataset:token_processor:102 - exceeding max length 2048, ignore last 1 samples!
2023-11-02 18:08:52.934 | INFO | mmgpt.data.dataset.pair_webdataset:token_processor:103 - (None, '[559, 566, 706, 805] A father and [606, 621, 710, 792] daughter sit on [510, 777, 852, 997] rocks looking out over [000, 481, 998, 882] a loch')