2023-11-16 06:11:33,784 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,786 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,786 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
 - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
 - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,786 Train: 30000 sentences
2023-11-16 06:11:33,786 (train_with_dev=False, train_with_test=False)
2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,786 Training Params:
2023-11-16 06:11:33,786  - learning_rate: "5e-06"
2023-11-16 06:11:33,786  - mini_batch_size: "4"
2023-11-16 06:11:33,786  - max_epochs: "10"
2023-11-16 06:11:33,786  - shuffle: "True"
2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,786 Plugins:
2023-11-16 06:11:33,786  - TensorboardLogger
2023-11-16 06:11:33,786  - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 06:11:33,786 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,786 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 06:11:33,787  - metric: "('micro avg', 'f1-score')"
2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,787 Computation:
2023-11-16 06:11:33,787  - compute on device: cuda:0
2023-11-16 06:11:33,787  - embedding storage: none
2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,787 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4"
2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,787 ----------------------------------------------------------------------------------------------------
2023-11-16 06:11:33,787 Logging anything other than scalars to TensorBoard is currently not supported.
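For reference, a run with these exact hyperparameters could plausibly be launched with a Flair script along the following lines. This is a hypothetical reconstruction inferred from the log, not the original script: the dataset class, tagger options, plugin wiring and output path are all assumptions read off the configuration dump above.

```python
def train():
    """Sketch of a Flair fine-tuning run matching the logged configuration.

    Every name below is an assumption inferred from the log above,
    not taken from the actual training script.
    """
    from flair.datasets import NER_MULTI_XTREME
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer
    from flair.trainers.plugins import TensorboardLogger

    # English (train split only) + Georgian, as in the MultiCorpus summary
    corpus = NER_MULTI_XTREME(languages=["en", "ka"])
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # xlm-roberta-large: 24 layers, hidden size 1024, as in the model dump
    embeddings = TransformerWordEmbeddings("xlm-roberta-large", fine_tune=True)

    # No CRF and no RNN: the dump shows only LockedDropout + a linear head
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    # fine_tune() uses AdamW with a linear warmup schedule by default,
    # consistent with the "LinearScheduler | warmup_fraction: '0.1'" line
    trainer.fine_tune(
        "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4",
        learning_rate=5e-6,
        mini_batch_size=4,
        max_epochs=10,
        plugins=[TensorboardLogger()],
    )
```

Calling train() would download the XTREME splits and run the full fine-tuning job; it is only defined here as a sketch.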
2023-11-16 06:13:08,349 epoch 1 - iter 750/7500 - loss 2.53865900 - time (sec): 94.56 - samples/sec: 253.75 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:14:42,568 epoch 1 - iter 1500/7500 - loss 2.13967550 - time (sec): 188.78 - samples/sec: 256.47 - lr: 0.000001 - momentum: 0.000000
2023-11-16 06:16:16,468 epoch 1 - iter 2250/7500 - loss 1.90406353 - time (sec): 282.68 - samples/sec: 256.63 - lr: 0.000001 - momentum: 0.000000
2023-11-16 06:17:48,884 epoch 1 - iter 3000/7500 - loss 1.67899229 - time (sec): 375.10 - samples/sec: 256.80 - lr: 0.000002 - momentum: 0.000000
2023-11-16 06:19:19,876 epoch 1 - iter 3750/7500 - loss 1.48518547 - time (sec): 466.09 - samples/sec: 258.11 - lr: 0.000002 - momentum: 0.000000
2023-11-16 06:20:52,111 epoch 1 - iter 4500/7500 - loss 1.33429739 - time (sec): 558.32 - samples/sec: 259.20 - lr: 0.000003 - momentum: 0.000000
2023-11-16 06:22:25,071 epoch 1 - iter 5250/7500 - loss 1.22009996 - time (sec): 651.28 - samples/sec: 258.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 06:23:57,896 epoch 1 - iter 6000/7500 - loss 1.13315230 - time (sec): 744.11 - samples/sec: 258.55 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:25:28,930 epoch 1 - iter 6750/7500 - loss 1.06203012 - time (sec): 835.14 - samples/sec: 259.35 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:27:03,002 epoch 1 - iter 7500/7500 - loss 1.00081782 - time (sec): 929.21 - samples/sec: 259.14 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:27:03,005 ----------------------------------------------------------------------------------------------------
2023-11-16 06:27:03,005 EPOCH 1 done: loss 1.0008 - lr: 0.000005
2023-11-16 06:27:30,760 DEV : loss 0.3205418884754181 - f1-score (micro avg)  0.7957
2023-11-16 06:27:33,296 saving best model
2023-11-16 06:27:35,260 ----------------------------------------------------------------------------------------------------
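The lr column above is consistent with linear warmup over the first 10% of updates followed by linear decay: with 10 epochs of 7500 iterations, warmup_fraction 0.1 puts the peak of 5e-06 exactly at the end of epoch 1, matching the logged ramp from 0.000000 to 0.000005. A minimal sketch of that schedule, assuming plain linear warmup then linear decay (the exact scheduler internals are not shown in the log):

```python
def linear_schedule_lr(step: int, peak_lr: float, total_steps: int,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to zero.

    Assumed shape of the LinearScheduler in the log; step counts
    taken from 7500 iterations/epoch x 10 epochs.
    """
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL_STEPS = 75_000   # 10 epochs x 7500 iterations
PEAK_LR = 5e-6         # the configured learning_rate
# Peak is reached at step 7500 (end of epoch 1), then the rate
# decays linearly toward zero by the end of epoch 10.
```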
2023-11-16 06:29:08,221 epoch 2 - iter 750/7500 - loss 0.40584352 - time (sec): 92.96 - samples/sec: 259.85 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:30:41,767 epoch 2 - iter 1500/7500 - loss 0.41800053 - time (sec): 186.50 - samples/sec: 258.62 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:32:13,981 epoch 2 - iter 2250/7500 - loss 0.40515032 - time (sec): 278.72 - samples/sec: 260.55 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:33:44,785 epoch 2 - iter 3000/7500 - loss 0.40416870 - time (sec): 369.52 - samples/sec: 261.05 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:35:15,515 epoch 2 - iter 3750/7500 - loss 0.40544240 - time (sec): 460.25 - samples/sec: 263.21 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:36:48,219 epoch 2 - iter 4500/7500 - loss 0.40263197 - time (sec): 552.96 - samples/sec: 262.37 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:38:23,766 epoch 2 - iter 5250/7500 - loss 0.39942117 - time (sec): 648.50 - samples/sec: 260.48 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:39:57,620 epoch 2 - iter 6000/7500 - loss 0.40065088 - time (sec): 742.36 - samples/sec: 259.79 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:41:29,077 epoch 2 - iter 6750/7500 - loss 0.39965016 - time (sec): 833.81 - samples/sec: 260.34 - lr: 0.000005 - momentum: 0.000000
2023-11-16 06:43:02,733 epoch 2 - iter 7500/7500 - loss 0.39861413 - time (sec): 927.47 - samples/sec: 259.63 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:43:02,736 ----------------------------------------------------------------------------------------------------
2023-11-16 06:43:02,736 EPOCH 2 done: loss 0.3986 - lr: 0.000004
2023-11-16 06:43:29,322 DEV : loss 0.2607610523700714 - f1-score (micro avg)  0.8643
2023-11-16 06:43:31,142 saving best model
2023-11-16 06:43:33,553 ----------------------------------------------------------------------------------------------------
2023-11-16 06:45:08,034 epoch 3 - iter 750/7500 - loss 0.37315879 - time (sec): 94.48 - samples/sec: 253.04 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:46:39,743 epoch 3 - iter 1500/7500 - loss 0.35743568 - time (sec): 186.18 - samples/sec: 256.62 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:48:12,901 epoch 3 - iter 2250/7500 - loss 0.35305153 - time (sec): 279.34 - samples/sec: 259.77 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:49:45,353 epoch 3 - iter 3000/7500 - loss 0.35234824 - time (sec): 371.79 - samples/sec: 259.85 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:51:20,378 epoch 3 - iter 3750/7500 - loss 0.35046792 - time (sec): 466.82 - samples/sec: 258.57 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:52:53,238 epoch 3 - iter 4500/7500 - loss 0.35142197 - time (sec): 559.68 - samples/sec: 259.89 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:54:23,949 epoch 3 - iter 5250/7500 - loss 0.34665555 - time (sec): 650.39 - samples/sec: 260.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:55:56,860 epoch 3 - iter 6000/7500 - loss 0.35003084 - time (sec): 743.30 - samples/sec: 259.59 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:57:30,078 epoch 3 - iter 6750/7500 - loss 0.34700719 - time (sec): 836.52 - samples/sec: 259.35 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:59:02,988 epoch 3 - iter 7500/7500 - loss 0.34834444 - time (sec): 929.43 - samples/sec: 259.08 - lr: 0.000004 - momentum: 0.000000
2023-11-16 06:59:02,990 ----------------------------------------------------------------------------------------------------
2023-11-16 06:59:02,990 EPOCH 3 done: loss 0.3483 - lr: 0.000004
2023-11-16 06:59:30,217 DEV : loss 0.2834814190864563 - f1-score (micro avg)  0.881
2023-11-16 06:59:32,866 saving best model
2023-11-16 06:59:35,803 ----------------------------------------------------------------------------------------------------
2023-11-16 07:01:09,960 epoch 4 - iter 750/7500 - loss 0.29042774 - time (sec): 94.15 - samples/sec: 256.86 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:02:45,034 epoch 4 - iter 1500/7500 - loss 0.28875226 - time (sec): 189.23 - samples/sec: 258.73 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:04:20,800 epoch 4 - iter 2250/7500 - loss 0.30241778 - time (sec): 284.99 - samples/sec: 255.97 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:05:55,249 epoch 4 - iter 3000/7500 - loss 0.30810931 - time (sec): 379.44 - samples/sec: 254.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:07:28,778 epoch 4 - iter 3750/7500 - loss 0.30459660 - time (sec): 472.97 - samples/sec: 255.40 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:08:59,582 epoch 4 - iter 4500/7500 - loss 0.30550384 - time (sec): 563.77 - samples/sec: 257.73 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:10:31,165 epoch 4 - iter 5250/7500 - loss 0.30595152 - time (sec): 655.36 - samples/sec: 258.24 - lr: 0.000004 - momentum: 0.000000
2023-11-16 07:12:04,192 epoch 4 - iter 6000/7500 - loss 0.30648476 - time (sec): 748.38 - samples/sec: 258.00 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:13:38,216 epoch 4 - iter 6750/7500 - loss 0.30712803 - time (sec): 842.41 - samples/sec: 257.62 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:15:12,386 epoch 4 - iter 7500/7500 - loss 0.30384345 - time (sec): 936.58 - samples/sec: 257.10 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:15:12,389 ----------------------------------------------------------------------------------------------------
2023-11-16 07:15:12,389 EPOCH 4 done: loss 0.3038 - lr: 0.000003
2023-11-16 07:15:39,642 DEV : loss 0.2750042676925659 - f1-score (micro avg)  0.8871
2023-11-16 07:15:41,637 saving best model
2023-11-16 07:15:44,075 ----------------------------------------------------------------------------------------------------
2023-11-16 07:17:17,606 epoch 5 - iter 750/7500 - loss 0.22837945 - time (sec): 93.53 - samples/sec: 253.79 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:18:50,674 epoch 5 - iter 1500/7500 - loss 0.24801582 - time (sec): 186.59 - samples/sec: 255.33 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:20:23,354 epoch 5 - iter 2250/7500 - loss 0.24364625 - time (sec): 279.27 - samples/sec: 258.70 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:21:53,572 epoch 5 - iter 3000/7500 - loss 0.25086533 - time (sec): 369.49 - samples/sec: 261.28 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:23:27,878 epoch 5 - iter 3750/7500 - loss 0.25125342 - time (sec): 463.80 - samples/sec: 260.45 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:24:59,899 epoch 5 - iter 4500/7500 - loss 0.25211752 - time (sec): 555.82 - samples/sec: 259.74 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:26:31,650 epoch 5 - iter 5250/7500 - loss 0.25096563 - time (sec): 647.57 - samples/sec: 259.86 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:28:06,382 epoch 5 - iter 6000/7500 - loss 0.25437307 - time (sec): 742.30 - samples/sec: 258.90 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:29:38,647 epoch 5 - iter 6750/7500 - loss 0.25716650 - time (sec): 834.57 - samples/sec: 259.17 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:31:13,423 epoch 5 - iter 7500/7500 - loss 0.25526851 - time (sec): 929.34 - samples/sec: 259.10 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:31:13,426 ----------------------------------------------------------------------------------------------------
2023-11-16 07:31:13,427 EPOCH 5 done: loss 0.2553 - lr: 0.000003
2023-11-16 07:31:40,891 DEV : loss 0.2662450671195984 - f1-score (micro avg)  0.8974
2023-11-16 07:31:43,349 saving best model
2023-11-16 07:31:46,083 ----------------------------------------------------------------------------------------------------
2023-11-16 07:33:16,627 epoch 6 - iter 750/7500 - loss 0.19587155 - time (sec): 90.54 - samples/sec: 263.71 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:34:47,310 epoch 6 - iter 1500/7500 - loss 0.20788294 - time (sec): 181.22 - samples/sec: 265.84 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:36:21,324 epoch 6 - iter 2250/7500 - loss 0.20608536 - time (sec): 275.24 - samples/sec: 264.05 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:37:54,486 epoch 6 - iter 3000/7500 - loss 0.21411200 - time (sec): 368.40 - samples/sec: 261.45 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:39:27,386 epoch 6 - iter 3750/7500 - loss 0.21815036 - time (sec): 461.30 - samples/sec: 260.18 - lr: 0.000003 - momentum: 0.000000
2023-11-16 07:41:00,912 epoch 6 - iter 4500/7500 - loss 0.21725635 - time (sec): 554.83 - samples/sec: 260.18 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:42:34,526 epoch 6 - iter 5250/7500 - loss 0.21942273 - time (sec): 648.44 - samples/sec: 259.04 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:44:07,951 epoch 6 - iter 6000/7500 - loss 0.22107059 - time (sec): 741.87 - samples/sec: 258.58 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:45:41,437 epoch 6 - iter 6750/7500 - loss 0.22258724 - time (sec): 835.35 - samples/sec: 258.83 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:47:12,427 epoch 6 - iter 7500/7500 - loss 0.22153847 - time (sec): 926.34 - samples/sec: 259.94 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:47:12,430 ----------------------------------------------------------------------------------------------------
2023-11-16 07:47:12,430 EPOCH 6 done: loss 0.2215 - lr: 0.000002
2023-11-16 07:47:39,790 DEV : loss 0.2961623966693878 - f1-score (micro avg)  0.9003
2023-11-16 07:47:42,072 saving best model
2023-11-16 07:47:44,511 ----------------------------------------------------------------------------------------------------
2023-11-16 07:49:18,880 epoch 7 - iter 750/7500 - loss 0.16803306 - time (sec): 94.36 - samples/sec: 255.20 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:50:54,973 epoch 7 - iter 1500/7500 - loss 0.17324952 - time (sec): 190.46 - samples/sec: 254.75 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:52:32,192 epoch 7 - iter 2250/7500 - loss 0.17809510 - time (sec): 287.68 - samples/sec: 251.96 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:54:06,050 epoch 7 - iter 3000/7500 - loss 0.18157709 - time (sec): 381.53 - samples/sec: 252.63 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:55:39,777 epoch 7 - iter 3750/7500 - loss 0.18010115 - time (sec): 475.26 - samples/sec: 252.98 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:57:12,748 epoch 7 - iter 4500/7500 - loss 0.18253655 - time (sec): 568.23 - samples/sec: 253.84 - lr: 0.000002 - momentum: 0.000000
2023-11-16 07:58:45,045 epoch 7 - iter 5250/7500 - loss 0.18478993 - time (sec): 660.53 - samples/sec: 254.99 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:00:17,924 epoch 7 - iter 6000/7500 - loss 0.18257351 - time (sec): 753.41 - samples/sec: 255.75 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:01:49,343 epoch 7 - iter 6750/7500 - loss 0.18422323 - time (sec): 844.83 - samples/sec: 256.40 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:03:22,527 epoch 7 - iter 7500/7500 - loss 0.18484974 - time (sec): 938.01 - samples/sec: 256.71 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:03:22,531 ----------------------------------------------------------------------------------------------------
2023-11-16 08:03:22,531 EPOCH 7 done: loss 0.1848 - lr: 0.000002
2023-11-16 08:03:48,970 DEV : loss 0.305960088968277 - f1-score (micro avg)  0.9028
2023-11-16 08:03:51,887 saving best model
2023-11-16 08:03:53,942 ----------------------------------------------------------------------------------------------------
2023-11-16 08:05:28,806 epoch 8 - iter 750/7500 - loss 0.14648152 - time (sec): 94.86 - samples/sec: 244.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:07:04,823 epoch 8 - iter 1500/7500 - loss 0.15989226 - time (sec): 190.88 - samples/sec: 250.15 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:08:37,702 epoch 8 - iter 2250/7500 - loss 0.16196706 - time (sec): 283.76 - samples/sec: 254.57 - lr: 0.000002 - momentum: 0.000000
2023-11-16 08:10:09,006 epoch 8 - iter 3000/7500 - loss 0.16121972 - time (sec): 375.06 - samples/sec: 257.26 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:11:41,390 epoch 8 - iter 3750/7500 - loss 0.15974733 - time (sec): 467.45 - samples/sec: 257.59 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:13:14,066 epoch 8 - iter 4500/7500 - loss 0.15727904 - time (sec): 560.12 - samples/sec: 258.26 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:14:47,177 epoch 8 - iter 5250/7500 - loss 0.15597106 - time (sec): 653.23 - samples/sec: 257.92 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:16:17,918 epoch 8 - iter 6000/7500 - loss 0.15441827 - time (sec): 743.97 - samples/sec: 258.11 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:17:52,222 epoch 8 - iter 6750/7500 - loss 0.15283100 - time (sec): 838.28 - samples/sec: 258.17 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:19:25,488 epoch 8 - iter 7500/7500 - loss 0.15507668 - time (sec): 931.54 - samples/sec: 258.49 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:19:25,491 ----------------------------------------------------------------------------------------------------
2023-11-16 08:19:25,491 EPOCH 8 done: loss 0.1551 - lr: 0.000001
2023-11-16 08:19:53,344 DEV : loss 0.3231204152107239 - f1-score (micro avg)  0.9014
2023-11-16 08:19:55,450 ----------------------------------------------------------------------------------------------------
2023-11-16 08:21:28,400 epoch 9 - iter 750/7500 - loss 0.12523890 - time (sec): 92.95 - samples/sec: 258.76 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:23:00,060 epoch 9 - iter 1500/7500 - loss 0.12801485 - time (sec): 184.61 - samples/sec: 263.03 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:24:36,577 epoch 9 - iter 2250/7500 - loss 0.13158450 - time (sec): 281.12 - samples/sec: 255.88 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:26:08,372 epoch 9 - iter 3000/7500 - loss 0.12955430 - time (sec): 372.92 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:27:42,811 epoch 9 - iter 3750/7500 - loss 0.13110177 - time (sec): 467.36 - samples/sec: 256.70 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:29:15,844 epoch 9 - iter 4500/7500 - loss 0.13696235 - time (sec): 560.39 - samples/sec: 256.51 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:30:48,381 epoch 9 - iter 5250/7500 - loss 0.13444283 - time (sec): 652.93 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:32:21,725 epoch 9 - iter 6000/7500 - loss 0.13580845 - time (sec): 746.27 - samples/sec: 258.30 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:33:54,750 epoch 9 - iter 6750/7500 - loss 0.13419816 - time (sec): 839.30 - samples/sec: 258.02 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:35:28,480 epoch 9 - iter 7500/7500 - loss 0.13459907 - time (sec): 933.03 - samples/sec: 258.08 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:35:28,482 ----------------------------------------------------------------------------------------------------
2023-11-16 08:35:28,483 EPOCH 9 done: loss 0.1346 - lr: 0.000001
2023-11-16 08:35:55,875 DEV : loss 0.3105945885181427 - f1-score (micro avg)  0.9036
2023-11-16 08:35:58,080 saving best model
2023-11-16 08:36:01,035 ----------------------------------------------------------------------------------------------------
2023-11-16 08:37:36,044 epoch 10 - iter 750/7500 - loss 0.10551800 - time (sec): 95.01 - samples/sec: 248.68 - lr: 0.000001 - momentum: 0.000000
2023-11-16 08:39:09,059 epoch 10 - iter 1500/7500 - loss 0.11970928 - time (sec): 188.02 - samples/sec: 251.27 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:40:42,311 epoch 10 - iter 2250/7500 - loss 0.12199666 - time (sec): 281.27 - samples/sec: 256.11 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:42:13,894 epoch 10 - iter 3000/7500 - loss 0.12112190 - time (sec): 372.86 - samples/sec: 257.41 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:43:44,971 epoch 10 - iter 3750/7500 - loss 0.12198423 - time (sec): 463.93 - samples/sec: 259.71 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:45:19,480 epoch 10 - iter 4500/7500 - loss 0.11644070 - time (sec): 558.44 - samples/sec: 259.28 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:46:52,528 epoch 10 - iter 5250/7500 - loss 0.12094725 - time (sec): 651.49 - samples/sec: 259.32 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:48:29,025 epoch 10 - iter 6000/7500 - loss 0.11921992 - time (sec): 747.99 - samples/sec: 257.59 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:50:06,215 epoch 10 - iter 6750/7500 - loss 0.11723856 - time (sec): 845.18 - samples/sec: 256.47 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:51:43,737 epoch 10 - iter 7500/7500 - loss 0.11691516 - time (sec): 942.70 - samples/sec: 255.43 - lr: 0.000000 - momentum: 0.000000
2023-11-16 08:51:43,740 ----------------------------------------------------------------------------------------------------
2023-11-16 08:51:43,740 EPOCH 10 done: loss 0.1169 - lr: 0.000000
2023-11-16 08:52:11,462 DEV : loss 0.3263167440891266 - f1-score (micro avg)  0.905
2023-11-16 08:52:14,084 saving best model
2023-11-16 08:52:19,334 ----------------------------------------------------------------------------------------------------
2023-11-16 08:52:19,337 Loading model from best epoch ...
2023-11-16 08:52:29,363 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
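The 13 tags are a BIOES scheme over the three XTREME entity types: the single O tag plus S/B/E/I variants for LOC, ORG and PER (1 + 4 × 3 = 13, matching out_features=13 of the linear head in the model dump). A quick sketch of how that tag set expands:

```python
# BIOES tag set for the three entity types, in the order the tagger prints
entity_types = ["LOC", "ORG", "PER"]
tags = ["O"] + [f"{prefix}-{etype}"
                for etype in entity_types
                for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 13
```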
2023-11-16 08:52:58,009
Results:
- F-score (micro) 0.9038
- F-score (macro) 0.9028
- Accuracy 0.8536

By class:
              precision    recall  f1-score   support

         LOC     0.9015    0.9153    0.9083      5288
         PER     0.9219    0.9417    0.9317      3962
         ORG     0.8674    0.8692    0.8683      3807

   micro avg     0.8979    0.9099    0.9038     13057
   macro avg     0.8969    0.9087    0.9028     13057
weighted avg     0.8977    0.9099    0.9037     13057
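As a sanity check, the micro-average F1 in the table is the harmonic mean of the reported micro precision and recall (up to rounding of the inputs):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# micro avg row from the classification report above
micro_f1 = f1(0.8979, 0.9099)
# agrees with the reported 0.9038 to within rounding error
```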
2023-11-16 08:52:58,009 ----------------------------------------------------------------------------------------------------