|
2023-10-12 08:04:12,035 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,037 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=13, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
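The module printout above fixes the encoder's dimensions: 12 blocks, hidden size 1472, attention inner size 384, feed-forward size 3584, and a 13-way linear head. As a rough cross-check, the sketch below counts the weights those shapes imply. It assumes that `shared` and `embed_tokens` are one tied matrix and that the `Embedding(32, 6)` relative-attention bias (printed only for block 0) means 6 attention heads; the log states neither explicitly.

```python
# Shapes copied from the module printout; helper names are illustrative.
D_MODEL, D_ATTN, D_FF = 1472, 384, 3584
VOCAB, N_BLOCKS, N_TAGS = 384, 12, 13
REL_BUCKETS, N_HEADS = 32, 6  # from Embedding(32, 6); the head count is an assumption


def encoder_params() -> int:
    """Count encoder weights implied by the printed layer shapes."""
    attention = 4 * D_MODEL * D_ATTN   # q, k, v, o (all bias=False)
    feed_forward = 3 * D_MODEL * D_FF  # wi_0, wi_1, wo (all bias=False)
    norms = 2 * D_MODEL                # one RMSNorm weight vector per sub-layer
    per_block = attention + feed_forward + norms
    embedding = VOCAB * D_MODEL        # assumed tied: shared == embed_tokens
    rel_bias = REL_BUCKETS * N_HEADS   # present only in block 0
    final_norm = D_MODEL
    return N_BLOCKS * per_block + embedding + rel_bias + final_norm


def head_params() -> int:
    """Count weights of the linear tag head (bias=True)."""
    return D_MODEL * N_TAGS + N_TAGS


if __name__ == "__main__":
    print("encoder weights:", encoder_params())
    print("tag head weights:", head_params())
    print("per-head dimension:", D_ATTN // N_HEADS)
```

Under these assumptions the encoder holds about 217.7 million weights, and the tag head adds fewer than 20 thousand.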
|
2023-10-12 08:04:12,038 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,038 MultiCorpus: 7936 train + 992 dev + 992 test sentences |
|
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr |
|
2023-10-12 08:04:12,038 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,038 Train: 7936 sentences |
|
2023-10-12 08:04:12,038 (train_with_dev=False, train_with_test=False) |
|
2023-10-12 08:04:12,038 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,038 Training Params: |
|
2023-10-12 08:04:12,039 - learning_rate: "0.00015" |
|
2023-10-12 08:04:12,039 - mini_batch_size: "8" |
|
2023-10-12 08:04:12,039 - max_epochs: "10" |
|
2023-10-12 08:04:12,039 - shuffle: "True" |
|
2023-10-12 08:04:12,039 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,039 Plugins: |
|
2023-10-12 08:04:12,039 - TensorboardLogger |
|
2023-10-12 08:04:12,039 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-12 08:04:12,039 ---------------------------------------------------------------------------------------------------- |
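The `lr` column in the iteration lines of this log is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below reproduces those values under two assumptions: the scheduler advances once per mini-batch, and the run has 992 × 10 = 9920 steps (7936 training sentences at batch size 8, for 10 epochs). `linear_schedule_lr` and `global_step` are hypothetical helpers, not Flair's implementation.

```python
PEAK_LR = 0.00015
STEPS_PER_EPOCH = 7936 // 8                # 992 mini-batches per epoch
TOTAL_STEPS = STEPS_PER_EPOCH * 10         # max_epochs = 10
WARMUP_STEPS = int(TOTAL_STEPS * 0.1)      # warmup_fraction = 0.1


def global_step(epoch: int, iteration: int) -> int:
    """Convert a 1-based (epoch, iter) pair from the log into a step count."""
    return (epoch - 1) * STEPS_PER_EPOCH + iteration


def linear_schedule_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch, iteration in [(1, 99), (1, 990), (2, 99), (5, 495), (10, 990)]:
        lr = linear_schedule_lr(global_step(epoch, iteration))
        print(f"epoch {epoch} - iter {iteration}/992 - lr: {lr:.6f}")
```

With these settings the warmup ends exactly at the close of epoch 1, which is why the logged rate peaks at 0.000150 there and falls in every later epoch.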
|
2023-10-12 08:04:12,039 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-12 08:04:12,039 - metric: "('micro avg', 'f1-score')" |
|
2023-10-12 08:04:12,039 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,039 Computation: |
|
2023-10-12 08:04:12,040 - compute on device: cuda:0 |
|
2023-10-12 08:04:12,040 - embedding storage: none |
|
2023-10-12 08:04:12,040 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,040 Model training base path: "hmbench-icdar/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-12 08:04:12,040 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,040 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:04:12,040 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-12 08:05:06,449 epoch 1 - iter 99/992 - loss 2.58555570 - time (sec): 54.41 - samples/sec: 284.28 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-12 08:05:56,282 epoch 1 - iter 198/992 - loss 2.53686260 - time (sec): 104.24 - samples/sec: 302.44 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-12 08:06:46,233 epoch 1 - iter 297/992 - loss 2.34204530 - time (sec): 154.19 - samples/sec: 310.69 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-12 08:07:34,661 epoch 1 - iter 396/992 - loss 2.08169120 - time (sec): 202.62 - samples/sec: 317.45 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-12 08:08:24,602 epoch 1 - iter 495/992 - loss 1.82694454 - time (sec): 252.56 - samples/sec: 317.85 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-12 08:09:14,534 epoch 1 - iter 594/992 - loss 1.59247411 - time (sec): 302.49 - samples/sec: 320.69 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-12 08:10:03,550 epoch 1 - iter 693/992 - loss 1.39639771 - time (sec): 351.51 - samples/sec: 325.34 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-12 08:10:59,316 epoch 1 - iter 792/992 - loss 1.25466641 - time (sec): 407.27 - samples/sec: 321.81 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-12 08:11:53,277 epoch 1 - iter 891/992 - loss 1.14679909 - time (sec): 461.23 - samples/sec: 319.44 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-12 08:12:42,482 epoch 1 - iter 990/992 - loss 1.05527957 - time (sec): 510.44 - samples/sec: 320.66 - lr: 0.000150 - momentum: 0.000000 |
|
2023-10-12 08:12:43,887 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:12:43,888 EPOCH 1 done: loss 1.0537 - lr: 0.000150 |
|
2023-10-12 08:13:10,955 DEV : loss 0.18303227424621582 - f1-score (micro avg) 0.355 |
|
2023-10-12 08:13:11,005 saving best model |
|
2023-10-12 08:13:11,967 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:14:02,123 epoch 2 - iter 99/992 - loss 0.24094875 - time (sec): 50.15 - samples/sec: 328.44 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-12 08:14:55,551 epoch 2 - iter 198/992 - loss 0.20476026 - time (sec): 103.58 - samples/sec: 315.61 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-12 08:15:49,201 epoch 2 - iter 297/992 - loss 0.19182712 - time (sec): 157.23 - samples/sec: 311.94 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-12 08:16:42,272 epoch 2 - iter 396/992 - loss 0.18719035 - time (sec): 210.30 - samples/sec: 312.52 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-12 08:17:36,327 epoch 2 - iter 495/992 - loss 0.18025399 - time (sec): 264.36 - samples/sec: 311.59 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-12 08:18:27,193 epoch 2 - iter 594/992 - loss 0.17568004 - time (sec): 315.22 - samples/sec: 312.61 - lr: 0.000140 - momentum: 0.000000 |
|
2023-10-12 08:19:21,888 epoch 2 - iter 693/992 - loss 0.17273836 - time (sec): 369.92 - samples/sec: 311.10 - lr: 0.000138 - momentum: 0.000000 |
|
2023-10-12 08:20:16,613 epoch 2 - iter 792/992 - loss 0.16679007 - time (sec): 424.64 - samples/sec: 308.04 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-12 08:21:07,673 epoch 2 - iter 891/992 - loss 0.16155770 - time (sec): 475.70 - samples/sec: 308.99 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-12 08:22:02,055 epoch 2 - iter 990/992 - loss 0.15729141 - time (sec): 530.08 - samples/sec: 308.45 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-12 08:22:03,276 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:22:03,276 EPOCH 2 done: loss 0.1570 - lr: 0.000133 |
|
2023-10-12 08:22:30,282 DEV : loss 0.0930715873837471 - f1-score (micro avg) 0.7059 |
|
2023-10-12 08:22:30,322 saving best model |
|
2023-10-12 08:22:33,238 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:23:31,054 epoch 3 - iter 99/992 - loss 0.09366490 - time (sec): 57.81 - samples/sec: 272.24 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-12 08:24:26,452 epoch 3 - iter 198/992 - loss 0.09371713 - time (sec): 113.21 - samples/sec: 282.27 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-12 08:25:20,392 epoch 3 - iter 297/992 - loss 0.09379383 - time (sec): 167.15 - samples/sec: 291.97 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-12 08:26:13,353 epoch 3 - iter 396/992 - loss 0.09391945 - time (sec): 220.11 - samples/sec: 296.14 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-12 08:27:04,613 epoch 3 - iter 495/992 - loss 0.09225266 - time (sec): 271.37 - samples/sec: 299.92 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-12 08:27:58,875 epoch 3 - iter 594/992 - loss 0.09074357 - time (sec): 325.63 - samples/sec: 299.29 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-12 08:28:48,568 epoch 3 - iter 693/992 - loss 0.09041959 - time (sec): 375.33 - samples/sec: 301.62 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-12 08:29:39,210 epoch 3 - iter 792/992 - loss 0.08851234 - time (sec): 425.97 - samples/sec: 307.21 - lr: 0.000120 - momentum: 0.000000 |
|
2023-10-12 08:30:29,714 epoch 3 - iter 891/992 - loss 0.08699505 - time (sec): 476.47 - samples/sec: 310.07 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-12 08:31:24,309 epoch 3 - iter 990/992 - loss 0.08693648 - time (sec): 531.07 - samples/sec: 308.23 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-12 08:31:25,444 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:31:25,445 EPOCH 3 done: loss 0.0869 - lr: 0.000117 |
|
2023-10-12 08:31:51,548 DEV : loss 0.09189649671316147 - f1-score (micro avg) 0.7402 |
|
2023-10-12 08:31:51,594 saving best model |
|
2023-10-12 08:31:54,213 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:32:43,918 epoch 4 - iter 99/992 - loss 0.06142325 - time (sec): 49.70 - samples/sec: 344.89 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-12 08:33:36,309 epoch 4 - iter 198/992 - loss 0.06089639 - time (sec): 102.09 - samples/sec: 333.62 - lr: 0.000113 - momentum: 0.000000 |
|
2023-10-12 08:34:25,710 epoch 4 - iter 297/992 - loss 0.06267594 - time (sec): 151.49 - samples/sec: 329.91 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-12 08:35:15,366 epoch 4 - iter 396/992 - loss 0.06005022 - time (sec): 201.15 - samples/sec: 326.69 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-12 08:36:07,476 epoch 4 - iter 495/992 - loss 0.05963130 - time (sec): 253.26 - samples/sec: 323.68 - lr: 0.000108 - momentum: 0.000000 |
|
2023-10-12 08:36:57,931 epoch 4 - iter 594/992 - loss 0.05891404 - time (sec): 303.71 - samples/sec: 322.82 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-12 08:37:46,481 epoch 4 - iter 693/992 - loss 0.05887390 - time (sec): 352.26 - samples/sec: 325.28 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-12 08:38:34,872 epoch 4 - iter 792/992 - loss 0.05887289 - time (sec): 400.65 - samples/sec: 326.32 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-12 08:39:29,526 epoch 4 - iter 891/992 - loss 0.05755141 - time (sec): 455.31 - samples/sec: 324.49 - lr: 0.000102 - momentum: 0.000000 |
|
2023-10-12 08:40:24,825 epoch 4 - iter 990/992 - loss 0.05782701 - time (sec): 510.61 - samples/sec: 320.69 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-12 08:40:25,816 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:40:25,816 EPOCH 4 done: loss 0.0578 - lr: 0.000100 |
|
2023-10-12 08:40:51,595 DEV : loss 0.09931203722953796 - f1-score (micro avg) 0.7623 |
|
2023-10-12 08:40:51,635 saving best model |
|
2023-10-12 08:40:57,442 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:41:49,152 epoch 5 - iter 99/992 - loss 0.04436841 - time (sec): 51.71 - samples/sec: 312.44 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-12 08:42:42,375 epoch 5 - iter 198/992 - loss 0.03706072 - time (sec): 104.93 - samples/sec: 308.20 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-12 08:43:35,907 epoch 5 - iter 297/992 - loss 0.03821323 - time (sec): 158.46 - samples/sec: 307.02 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-12 08:44:31,403 epoch 5 - iter 396/992 - loss 0.03912269 - time (sec): 213.96 - samples/sec: 304.26 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-12 08:45:22,108 epoch 5 - iter 495/992 - loss 0.03917046 - time (sec): 264.66 - samples/sec: 307.19 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-12 08:46:11,499 epoch 5 - iter 594/992 - loss 0.04022926 - time (sec): 314.05 - samples/sec: 311.81 - lr: 0.000090 - momentum: 0.000000 |
|
2023-10-12 08:46:59,993 epoch 5 - iter 693/992 - loss 0.03989622 - time (sec): 362.55 - samples/sec: 315.65 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-12 08:47:58,985 epoch 5 - iter 792/992 - loss 0.04056929 - time (sec): 421.54 - samples/sec: 311.55 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-12 08:48:50,892 epoch 5 - iter 891/992 - loss 0.04088817 - time (sec): 473.45 - samples/sec: 312.27 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-12 08:49:38,976 epoch 5 - iter 990/992 - loss 0.04156030 - time (sec): 521.53 - samples/sec: 313.73 - lr: 0.000083 - momentum: 0.000000 |
|
2023-10-12 08:49:40,070 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:49:40,071 EPOCH 5 done: loss 0.0415 - lr: 0.000083 |
|
2023-10-12 08:50:06,253 DEV : loss 0.11372340470552444 - f1-score (micro avg) 0.756 |
|
2023-10-12 08:50:06,293 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:50:55,831 epoch 6 - iter 99/992 - loss 0.02534475 - time (sec): 49.54 - samples/sec: 316.24 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-12 08:51:50,289 epoch 6 - iter 198/992 - loss 0.02728538 - time (sec): 103.99 - samples/sec: 307.91 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-12 08:52:44,094 epoch 6 - iter 297/992 - loss 0.02693384 - time (sec): 157.80 - samples/sec: 305.30 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-12 08:53:36,994 epoch 6 - iter 396/992 - loss 0.02900133 - time (sec): 210.70 - samples/sec: 309.30 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-12 08:54:29,419 epoch 6 - iter 495/992 - loss 0.02831503 - time (sec): 263.12 - samples/sec: 308.36 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-12 08:55:19,177 epoch 6 - iter 594/992 - loss 0.02808324 - time (sec): 312.88 - samples/sec: 312.68 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-12 08:56:07,273 epoch 6 - iter 693/992 - loss 0.02892834 - time (sec): 360.98 - samples/sec: 317.73 - lr: 0.000072 - momentum: 0.000000 |
|
2023-10-12 08:56:55,649 epoch 6 - iter 792/992 - loss 0.03069744 - time (sec): 409.35 - samples/sec: 319.35 - lr: 0.000070 - momentum: 0.000000 |
|
2023-10-12 08:57:50,900 epoch 6 - iter 891/992 - loss 0.03124301 - time (sec): 464.60 - samples/sec: 317.11 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-12 08:58:43,506 epoch 6 - iter 990/992 - loss 0.03144417 - time (sec): 517.21 - samples/sec: 316.34 - lr: 0.000067 - momentum: 0.000000 |
|
2023-10-12 08:58:44,493 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 08:58:44,493 EPOCH 6 done: loss 0.0314 - lr: 0.000067 |
|
2023-10-12 08:59:08,717 DEV : loss 0.13427288830280304 - f1-score (micro avg) 0.7743 |
|
2023-10-12 08:59:08,761 saving best model |
|
2023-10-12 08:59:11,805 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:00:02,976 epoch 7 - iter 99/992 - loss 0.01884774 - time (sec): 51.17 - samples/sec: 318.71 - lr: 0.000065 - momentum: 0.000000 |
|
2023-10-12 09:00:51,080 epoch 7 - iter 198/992 - loss 0.02301048 - time (sec): 99.27 - samples/sec: 332.12 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-12 09:01:38,861 epoch 7 - iter 297/992 - loss 0.02287236 - time (sec): 147.05 - samples/sec: 332.81 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-12 09:02:26,775 epoch 7 - iter 396/992 - loss 0.02348912 - time (sec): 194.97 - samples/sec: 336.71 - lr: 0.000060 - momentum: 0.000000 |
|
2023-10-12 09:03:14,307 epoch 7 - iter 495/992 - loss 0.02304502 - time (sec): 242.50 - samples/sec: 336.33 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-12 09:04:01,858 epoch 7 - iter 594/992 - loss 0.02320461 - time (sec): 290.05 - samples/sec: 337.37 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-12 09:04:51,398 epoch 7 - iter 693/992 - loss 0.02410117 - time (sec): 339.59 - samples/sec: 337.92 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-12 09:05:37,198 epoch 7 - iter 792/992 - loss 0.02467991 - time (sec): 385.39 - samples/sec: 336.38 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-12 09:06:24,113 epoch 7 - iter 891/992 - loss 0.02411616 - time (sec): 432.30 - samples/sec: 338.45 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-12 09:07:11,294 epoch 7 - iter 990/992 - loss 0.02385416 - time (sec): 479.48 - samples/sec: 341.20 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-12 09:07:12,268 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:07:12,268 EPOCH 7 done: loss 0.0238 - lr: 0.000050 |
|
2023-10-12 09:07:37,616 DEV : loss 0.16945815086364746 - f1-score (micro avg) 0.7625 |
|
2023-10-12 09:07:37,658 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:08:25,636 epoch 8 - iter 99/992 - loss 0.01826581 - time (sec): 47.98 - samples/sec: 352.82 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-12 09:09:15,818 epoch 8 - iter 198/992 - loss 0.01797263 - time (sec): 98.16 - samples/sec: 328.55 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-12 09:10:08,897 epoch 8 - iter 297/992 - loss 0.01946032 - time (sec): 151.24 - samples/sec: 315.74 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-12 09:10:58,922 epoch 8 - iter 396/992 - loss 0.01952599 - time (sec): 201.26 - samples/sec: 316.86 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-12 09:11:52,850 epoch 8 - iter 495/992 - loss 0.01857797 - time (sec): 255.19 - samples/sec: 315.56 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-12 09:12:45,735 epoch 8 - iter 594/992 - loss 0.02004084 - time (sec): 308.08 - samples/sec: 317.70 - lr: 0.000040 - momentum: 0.000000 |
|
2023-10-12 09:13:36,327 epoch 8 - iter 693/992 - loss 0.02014586 - time (sec): 358.67 - samples/sec: 317.68 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-12 09:14:30,138 epoch 8 - iter 792/992 - loss 0.01980906 - time (sec): 412.48 - samples/sec: 316.70 - lr: 0.000037 - momentum: 0.000000 |
|
2023-10-12 09:15:21,623 epoch 8 - iter 891/992 - loss 0.02006957 - time (sec): 463.96 - samples/sec: 316.10 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-12 09:16:09,180 epoch 8 - iter 990/992 - loss 0.01945183 - time (sec): 511.52 - samples/sec: 320.12 - lr: 0.000033 - momentum: 0.000000 |
|
2023-10-12 09:16:10,068 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:16:10,068 EPOCH 8 done: loss 0.0194 - lr: 0.000033 |
|
2023-10-12 09:16:34,568 DEV : loss 0.1777261346578598 - f1-score (micro avg) 0.7603 |
|
2023-10-12 09:16:34,606 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:17:22,132 epoch 9 - iter 99/992 - loss 0.02532699 - time (sec): 47.52 - samples/sec: 362.18 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-12 09:18:10,391 epoch 9 - iter 198/992 - loss 0.02156891 - time (sec): 95.78 - samples/sec: 351.32 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-12 09:18:56,755 epoch 9 - iter 297/992 - loss 0.01810471 - time (sec): 142.15 - samples/sec: 356.12 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-12 09:19:44,249 epoch 9 - iter 396/992 - loss 0.01799543 - time (sec): 189.64 - samples/sec: 348.96 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-12 09:20:31,198 epoch 9 - iter 495/992 - loss 0.01649332 - time (sec): 236.59 - samples/sec: 349.44 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-12 09:21:19,342 epoch 9 - iter 594/992 - loss 0.01544692 - time (sec): 284.73 - samples/sec: 347.94 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-12 09:22:07,048 epoch 9 - iter 693/992 - loss 0.01478198 - time (sec): 332.44 - samples/sec: 348.11 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-12 09:22:55,379 epoch 9 - iter 792/992 - loss 0.01567478 - time (sec): 380.77 - samples/sec: 344.64 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-12 09:23:42,926 epoch 9 - iter 891/992 - loss 0.01599589 - time (sec): 428.32 - samples/sec: 344.01 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-12 09:24:31,114 epoch 9 - iter 990/992 - loss 0.01558416 - time (sec): 476.51 - samples/sec: 343.42 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-12 09:24:32,095 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:24:32,095 EPOCH 9 done: loss 0.0156 - lr: 0.000017 |
|
2023-10-12 09:24:57,467 DEV : loss 0.18520045280456543 - f1-score (micro avg) 0.7619 |
|
2023-10-12 09:24:57,511 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:25:46,427 epoch 10 - iter 99/992 - loss 0.01050991 - time (sec): 48.91 - samples/sec: 341.19 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-12 09:26:34,054 epoch 10 - iter 198/992 - loss 0.01124717 - time (sec): 96.54 - samples/sec: 339.42 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-12 09:27:22,983 epoch 10 - iter 297/992 - loss 0.01103351 - time (sec): 145.47 - samples/sec: 339.57 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-12 09:28:15,052 epoch 10 - iter 396/992 - loss 0.01233967 - time (sec): 197.54 - samples/sec: 333.82 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-12 09:29:10,897 epoch 10 - iter 495/992 - loss 0.01168010 - time (sec): 253.38 - samples/sec: 325.72 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-12 09:30:07,424 epoch 10 - iter 594/992 - loss 0.01179804 - time (sec): 309.91 - samples/sec: 317.53 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-12 09:31:02,969 epoch 10 - iter 693/992 - loss 0.01193022 - time (sec): 365.46 - samples/sec: 312.37 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-12 09:31:59,360 epoch 10 - iter 792/992 - loss 0.01236357 - time (sec): 421.85 - samples/sec: 309.60 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-12 09:32:52,374 epoch 10 - iter 891/992 - loss 0.01270245 - time (sec): 474.86 - samples/sec: 310.06 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-12 09:33:40,701 epoch 10 - iter 990/992 - loss 0.01318705 - time (sec): 523.19 - samples/sec: 313.02 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-12 09:33:41,597 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:33:41,597 EPOCH 10 done: loss 0.0134 - lr: 0.000000 |
|
2023-10-12 09:34:08,746 DEV : loss 0.19303283095359802 - f1-score (micro avg) 0.7562 |
|
2023-10-12 09:34:09,762 ---------------------------------------------------------------------------------------------------- |
|
2023-10-12 09:34:09,764 Loading model from best epoch ... |
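The checkpoint loaded here is the one written by the last "saving best model" line. A minimal sketch of that selection, using the dev micro-F1 values logged after each epoch and assuming a checkpoint is written only on a strict improvement (the helper name is illustrative):

```python
# Dev micro-F1 after epochs 1..10, copied from the DEV lines of this log.
DEV_F1 = [0.355, 0.7059, 0.7402, 0.7623, 0.756,
          0.7743, 0.7625, 0.7603, 0.7619, 0.7562]


def epochs_that_save(scores):
    """Return the 1-based epochs whose dev score beats every earlier epoch."""
    best = float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            saved.append(epoch)
    return saved


if __name__ == "__main__":
    saved = epochs_that_save(DEV_F1)
    print("epochs that saved a checkpoint:", saved)
    print("best epoch:", saved[-1])
```

The epochs this returns match the positions of the "saving best model" lines, so the final evaluation uses the epoch 6 checkpoint (dev micro-F1 0.7743), not the last epoch.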
|
2023-10-12 09:34:15,360 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG |
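The 13 tags are the BIOES encoding of three entity types plus the outside tag `O`, which is also why the linear head has `out_features=13`. A small sketch that rebuilds the dictionary in the order printed above (the helper name is illustrative):

```python
def bioes_tags(entity_types):
    """Build a BIOES tag list: O, then S-, B-, E-, I- for each entity type."""
    tags = ["O"]
    for entity in entity_types:
        tags.extend(f"{prefix}-{entity}" for prefix in ("S", "B", "E", "I"))
    return tags


if __name__ == "__main__":
    tags = bioes_tags(["PER", "LOC", "ORG"])
    print(len(tags), "tags:", ", ".join(tags))
```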
|
2023-10-12 09:34:40,369 |
|
Results: |
|
- F-score (micro) 0.7486 |
|
- F-score (macro) 0.6567 |
|
- Accuracy 0.6255 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
         LOC     0.8082    0.8107    0.8095       655 |
|
         PER     0.6795    0.7892    0.7303       223 |
|
         ORG     0.5000    0.3780    0.4305       127 |
|
|
|
   micro avg     0.7460    0.7512    0.7486      1005 |
|
   macro avg     0.6626    0.6593    0.6567      1005 |
|
weighted avg     0.7407    0.7512    0.7440      1005 |
|
|
|
2023-10-12 09:34:40,369 ---------------------------------------------------------------------------------------------------- |
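The summary scores can be recomputed from the per-class rows. The sketch below assumes that micro-F1 is the harmonic mean of the micro-averaged precision and recall, that macro-F1 is the unweighted mean of the class F1 scores, and that the weighted average uses the support column. Because the inputs are already rounded to four decimals, the results agree with the report only to about three decimals.

```python
# (precision, recall, f1, support) per class, copied from the report above.
ROWS = {
    "LOC": (0.8082, 0.8107, 0.8095, 655),
    "PER": (0.6795, 0.7892, 0.7303, 223),
    "ORG": (0.5000, 0.3780, 0.4305, 127),
}
MICRO_P, MICRO_R = 0.7460, 0.7512  # micro avg row


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1(rows) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(r[2] for r in rows.values()) / len(rows)


def weighted_f1(rows) -> float:
    """Mean of the per-class F1 scores weighted by support."""
    support = sum(r[3] for r in rows.values())
    return sum(r[2] * r[3] for r in rows.values()) / support


if __name__ == "__main__":
    print(f"micro F1:    {f1(MICRO_P, MICRO_R):.4f}")
    print(f"macro F1:    {macro_f1(ROWS):.4f}")
    print(f"weighted F1: {weighted_f1(ROWS):.4f}")
```

The gap between micro-F1 and macro-F1 comes from ORG: it has the lowest F1 and the smallest support, so it pulls the unweighted mean down much more than the micro average.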
|
|