2024-07-29 08:00:18,857 ----------------------------------------------------------------------------------------------------
2024-07-29 08:00:18,857 Training Model
2024-07-29 08:00:18,857 ----------------------------------------------------------------------------------------------------
2024-07-29 08:00:18,857 Translator(
(encoder): EncoderLSTM(
(embedding): Embedding(112, 300, padding_idx=0)
(dropout): Dropout(p=0.1, inplace=False)
(lstm): LSTM(300, 512, batch_first=True, bidirectional=True)
)
(decoder): DecoderLSTM(
(embedding): Embedding(114, 300, padding_idx=0)
(dropout): Dropout(p=0.1, inplace=False)
(lstm): LSTM(300, 1024, batch_first=True)
(hidden2vocab): Linear(in_features=1024, out_features=114, bias=True)
(log_softmax): LogSoftmax(dim=-1)
)
)
2024-07-29 08:00:18,857 ----------------------------------------------------------------------------------------------------
2024-07-29 08:00:18,857 Training Hyperparameters:
2024-07-29 08:00:18,857 - max_epochs: 10
2024-07-29 08:00:18,857 - learning_rate: 0.001
2024-07-29 08:00:18,857 - batch_size: 128
2024-07-29 08:00:18,857 - patience: 5
2024-07-29 08:00:18,857 - scheduler_patience: 3
2024-07-29 08:00:18,857 - teacher_forcing_ratio: 0.5
2024-07-29 08:00:18,857 ----------------------------------------------------------------------------------------------------
2024-07-29 08:00:18,857 Computational Parameters:
2024-07-29 08:00:18,857 - num_workers: 4
2024-07-29 08:00:18,857 - device: device(type='cuda', index=0)
2024-07-29 08:00:18,857 ----------------------------------------------------------------------------------------------------
2024-07-29 08:00:18,857 Dataset Splits:
2024-07-29 08:00:18,857 - train: 133623 data points
2024-07-29 08:00:18,857 - dev: 19090 data points
2024-07-29 08:00:18,857 - test: 38179 data points
2024-07-29 08:00:18,857 ----------------------------------------------------------------------------------------------------
2024-07-29 08:00:18,857 EPOCH 1
2024-07-29 08:01:11,410 batch 104/1044 - loss 2.76805122 - lr 0.0010 - time 52.55s
2024-07-29 08:02:04,056 batch 208/1044 - loss 2.64841842 - lr 0.0010 - time 105.20s
2024-07-29 08:02:56,265 batch 312/1044 - loss 2.57898955 - lr 0.0010 - time 157.41s
2024-07-29 08:03:46,595 batch 416/1044 - loss 2.53192030 - lr 0.0010 - time 207.74s
2024-07-29 08:04:40,746 batch 520/1044 - loss 2.49861492 - lr 0.0010 - time 261.89s
2024-07-29 08:05:33,770 batch 624/1044 - loss 2.46931939 - lr 0.0010 - time 314.91s
2024-07-29 08:06:24,759 batch 728/1044 - loss 2.44438299 - lr 0.0010 - time 365.90s
2024-07-29 08:07:15,000 batch 832/1044 - loss 2.42292007 - lr 0.0010 - time 416.14s
2024-07-29 08:08:10,355 batch 936/1044 - loss 2.40563072 - lr 0.0010 - time 471.50s
2024-07-29 08:09:01,448 batch 1040/1044 - loss 2.38826131 - lr 0.0010 - time 522.59s
2024-07-29 08:09:03,325 ----------------------------------------------------------------------------------------------------
2024-07-29 08:09:03,326 EPOCH 1 DONE
2024-07-29 08:09:29,447 TRAIN Loss: 2.3877
2024-07-29 08:09:29,447 DEV Loss: 3.3787
2024-07-29 08:09:29,447 DEV Perplexity: 29.3338
2024-07-29 08:09:29,447 New best score!
2024-07-29 08:09:29,448 ----------------------------------------------------------------------------------------------------
2024-07-29 08:09:29,448 EPOCH 2
2024-07-29 08:10:22,147 batch 104/1044 - loss 2.22209186 - lr 0.0010 - time 52.70s
2024-07-29 08:11:14,838 batch 208/1044 - loss 2.21444971 - lr 0.0010 - time 105.39s
2024-07-29 08:12:04,854 batch 312/1044 - loss 2.20841359 - lr 0.0010 - time 155.41s
2024-07-29 08:12:56,469 batch 416/1044 - loss 2.20362168 - lr 0.0010 - time 207.02s
2024-07-29 08:13:51,217 batch 520/1044 - loss 2.19630995 - lr 0.0010 - time 261.77s
2024-07-29 08:14:44,604 batch 624/1044 - loss 2.19130721 - lr 0.0010 - time 315.16s
2024-07-29 08:15:36,532 batch 728/1044 - loss 2.19013349 - lr 0.0010 - time 367.08s
2024-07-29 08:16:29,169 batch 832/1044 - loss 2.18569613 - lr 0.0010 - time 419.72s
2024-07-29 08:17:19,487 batch 936/1044 - loss 2.18181469 - lr 0.0010 - time 470.04s
2024-07-29 08:18:11,669 batch 1040/1044 - loss 2.17839243 - lr 0.0010 - time 522.22s
2024-07-29 08:18:13,741 ----------------------------------------------------------------------------------------------------
2024-07-29 08:18:13,742 EPOCH 2 DONE
2024-07-29 08:18:39,641 TRAIN Loss: 2.1782
2024-07-29 08:18:39,641 DEV Loss: 3.3972
2024-07-29 08:18:39,641 DEV Perplexity: 29.8794
2024-07-29 08:18:39,641 No improvement for 1 epoch(s)
2024-07-29 08:18:39,641 ----------------------------------------------------------------------------------------------------
2024-07-29 08:18:39,641 EPOCH 3
2024-07-29 08:19:31,378 batch 104/1044 - loss 2.12722631 - lr 0.0010 - time 51.74s
2024-07-29 08:20:24,836 batch 208/1044 - loss 2.13235184 - lr 0.0010 - time 105.19s
2024-07-29 08:21:17,271 batch 312/1044 - loss 2.13028342 - lr 0.0010 - time 157.63s
2024-07-29 08:22:07,107 batch 416/1044 - loss 2.12492348 - lr 0.0010 - time 207.47s
2024-07-29 08:22:57,615 batch 520/1044 - loss 2.12266913 - lr 0.0010 - time 257.97s
2024-07-29 08:23:51,655 batch 624/1044 - loss 2.12023626 - lr 0.0010 - time 312.01s
2024-07-29 08:24:46,620 batch 728/1044 - loss 2.11750023 - lr 0.0010 - time 366.98s
2024-07-29 08:25:39,463 batch 832/1044 - loss 2.11486574 - lr 0.0010 - time 419.82s
2024-07-29 08:26:30,363 batch 936/1044 - loss 2.11231703 - lr 0.0010 - time 470.72s
2024-07-29 08:27:22,767 batch 1040/1044 - loss 2.10992234 - lr 0.0010 - time 523.13s
2024-07-29 08:27:24,801 ----------------------------------------------------------------------------------------------------
2024-07-29 08:27:24,803 EPOCH 3 DONE
2024-07-29 08:27:50,628 TRAIN Loss: 2.1095
2024-07-29 08:27:50,628 DEV Loss: 3.5338
2024-07-29 08:27:50,628 DEV Perplexity: 34.2544
2024-07-29 08:27:50,628 No improvement for 2 epoch(s)
2024-07-29 08:27:50,628 ----------------------------------------------------------------------------------------------------
2024-07-29 08:27:50,628 EPOCH 4
2024-07-29 08:28:44,890 batch 104/1044 - loss 2.07187447 - lr 0.0010 - time 54.26s
2024-07-29 08:29:36,382 batch 208/1044 - loss 2.07457917 - lr 0.0010 - time 105.75s
2024-07-29 08:30:27,957 batch 312/1044 - loss 2.07431510 - lr 0.0010 - time 157.33s
2024-07-29 08:31:17,958 batch 416/1044 - loss 2.07086038 - lr 0.0010 - time 207.33s
2024-07-29 08:32:08,523 batch 520/1044 - loss 2.06997228 - lr 0.0010 - time 257.89s
2024-07-29 08:33:02,282 batch 624/1044 - loss 2.06882375 - lr 0.0010 - time 311.65s
2024-07-29 08:33:53,334 batch 728/1044 - loss 2.06778251 - lr 0.0010 - time 362.71s
2024-07-29 08:34:44,960 batch 832/1044 - loss 2.06494840 - lr 0.0010 - time 414.33s
2024-07-29 08:35:33,596 batch 936/1044 - loss 2.06337325 - lr 0.0010 - time 462.97s
2024-07-29 08:36:27,117 batch 1040/1044 - loss 2.06208232 - lr 0.0010 - time 516.49s
2024-07-29 08:36:29,163 ----------------------------------------------------------------------------------------------------
2024-07-29 08:36:29,165 EPOCH 4 DONE
2024-07-29 08:36:55,198 TRAIN Loss: 2.0621
2024-07-29 08:36:55,199 DEV Loss: 3.3576
2024-07-29 08:36:55,199 DEV Perplexity: 28.7215
2024-07-29 08:36:55,199 New best score!
2024-07-29 08:36:55,200 ----------------------------------------------------------------------------------------------------
2024-07-29 08:36:55,200 EPOCH 5
2024-07-29 08:37:45,761 batch 104/1044 - loss 2.02026682 - lr 0.0010 - time 50.56s
2024-07-29 08:38:35,589 batch 208/1044 - loss 2.02837674 - lr 0.0010 - time 100.39s
2024-07-29 08:39:31,272 batch 312/1044 - loss 2.03032399 - lr 0.0010 - time 156.07s
2024-07-29 08:40:21,402 batch 416/1044 - loss 2.02753028 - lr 0.0010 - time 206.20s
2024-07-29 08:41:12,690 batch 520/1044 - loss 2.02868050 - lr 0.0010 - time 257.49s
2024-07-29 08:42:03,486 batch 624/1044 - loss 2.02747123 - lr 0.0010 - time 308.29s
2024-07-29 08:42:57,210 batch 728/1044 - loss 2.02448596 - lr 0.0010 - time 362.01s
2024-07-29 08:43:47,970 batch 832/1044 - loss 2.02472333 - lr 0.0010 - time 412.77s
2024-07-29 08:44:41,582 batch 936/1044 - loss 2.02443669 - lr 0.0010 - time 466.38s
2024-07-29 08:45:35,484 batch 1040/1044 - loss 2.02412656 - lr 0.0010 - time 520.28s
2024-07-29 08:45:37,426 ----------------------------------------------------------------------------------------------------
2024-07-29 08:45:37,428 EPOCH 5 DONE
2024-07-29 08:46:03,483 TRAIN Loss: 2.0239
2024-07-29 08:46:03,483 DEV Loss: 3.5994
2024-07-29 08:46:03,483 DEV Perplexity: 36.5764
2024-07-29 08:46:03,483 No improvement for 1 epoch(s)
2024-07-29 08:46:03,483 ----------------------------------------------------------------------------------------------------
2024-07-29 08:46:03,483 EPOCH 6
2024-07-29 08:46:55,366 batch 104/1044 - loss 2.00739912 - lr 0.0010 - time 51.88s
2024-07-29 08:47:49,416 batch 208/1044 - loss 2.01236906 - lr 0.0010 - time 105.93s
2024-07-29 08:48:40,631 batch 312/1044 - loss 2.00802403 - lr 0.0010 - time 157.15s
2024-07-29 08:49:31,859 batch 416/1044 - loss 2.00383683 - lr 0.0010 - time 208.38s
2024-07-29 08:50:24,038 batch 520/1044 - loss 2.00740076 - lr 0.0010 - time 260.55s
2024-07-29 08:51:15,147 batch 624/1044 - loss 2.00523553 - lr 0.0010 - time 311.66s
2024-07-29 08:52:08,144 batch 728/1044 - loss 2.00501477 - lr 0.0010 - time 364.66s
2024-07-29 08:53:01,139 batch 832/1044 - loss 2.00346529 - lr 0.0010 - time 417.66s
2024-07-29 08:53:52,167 batch 936/1044 - loss 2.00414460 - lr 0.0010 - time 468.68s
2024-07-29 08:54:44,524 batch 1040/1044 - loss 2.00255805 - lr 0.0010 - time 521.04s
2024-07-29 08:54:46,659 ----------------------------------------------------------------------------------------------------
2024-07-29 08:54:46,661 EPOCH 6 DONE
2024-07-29 08:55:12,782 TRAIN Loss: 2.0025
2024-07-29 08:55:12,782 DEV Loss: 3.3489
2024-07-29 08:55:12,782 DEV Perplexity: 28.4717
2024-07-29 08:55:12,782 New best score!
2024-07-29 08:55:12,783 ----------------------------------------------------------------------------------------------------
2024-07-29 08:55:12,783 EPOCH 7
2024-07-29 08:56:04,750 batch 104/1044 - loss 1.98695231 - lr 0.0010 - time 51.97s
2024-07-29 08:56:58,800 batch 208/1044 - loss 1.98767810 - lr 0.0010 - time 106.02s
2024-07-29 08:57:49,243 batch 312/1044 - loss 1.98459300 - lr 0.0010 - time 156.46s
2024-07-29 08:58:41,780 batch 416/1044 - loss 1.98503252 - lr 0.0010 - time 209.00s
2024-07-29 08:59:33,609 batch 520/1044 - loss 1.98710582 - lr 0.0010 - time 260.83s
2024-07-29 09:00:26,006 batch 624/1044 - loss 1.98528185 - lr 0.0010 - time 313.22s
2024-07-29 09:01:19,139 batch 728/1044 - loss 1.98337018 - lr 0.0010 - time 366.36s
2024-07-29 09:02:11,214 batch 832/1044 - loss 1.98256551 - lr 0.0010 - time 418.43s
2024-07-29 09:03:02,061 batch 936/1044 - loss 1.98131203 - lr 0.0010 - time 469.28s
2024-07-29 09:03:53,668 batch 1040/1044 - loss 1.97932312 - lr 0.0010 - time 520.88s
2024-07-29 09:03:55,909 ----------------------------------------------------------------------------------------------------
2024-07-29 09:03:55,910 EPOCH 7 DONE
2024-07-29 09:04:21,679 TRAIN Loss: 1.9796
2024-07-29 09:04:21,679 DEV Loss: 3.3571
2024-07-29 09:04:21,679 DEV Perplexity: 28.7050
2024-07-29 09:04:21,679 No improvement for 1 epoch(s)
2024-07-29 09:04:21,679 ----------------------------------------------------------------------------------------------------
2024-07-29 09:04:21,679 EPOCH 8
2024-07-29 09:05:13,500 batch 104/1044 - loss 1.97407123 - lr 0.0010 - time 51.82s
2024-07-29 09:06:04,321 batch 208/1044 - loss 1.96966393 - lr 0.0010 - time 102.64s
2024-07-29 09:06:55,085 batch 312/1044 - loss 1.96944196 - lr 0.0010 - time 153.41s
2024-07-29 09:07:47,563 batch 416/1044 - loss 1.96693789 - lr 0.0010 - time 205.88s
2024-07-29 09:08:40,188 batch 520/1044 - loss 1.96657811 - lr 0.0010 - time 258.51s
2024-07-29 09:09:32,010 batch 624/1044 - loss 1.96688818 - lr 0.0010 - time 310.33s
2024-07-29 09:10:22,905 batch 728/1044 - loss 1.96592610 - lr 0.0010 - time 361.23s
2024-07-29 09:11:15,842 batch 832/1044 - loss 1.96564289 - lr 0.0010 - time 414.16s
2024-07-29 09:12:07,382 batch 936/1044 - loss 1.96510702 - lr 0.0010 - time 465.70s
2024-07-29 09:13:00,098 batch 1040/1044 - loss 1.96494248 - lr 0.0010 - time 518.42s
2024-07-29 09:13:02,363 ----------------------------------------------------------------------------------------------------
2024-07-29 09:13:02,365 EPOCH 8 DONE
2024-07-29 09:13:28,243 TRAIN Loss: 1.9653
2024-07-29 09:13:28,244 DEV Loss: 3.2717
2024-07-29 09:13:28,244 DEV Perplexity: 26.3550
2024-07-29 09:13:28,244 New best score!
2024-07-29 09:13:28,245 ----------------------------------------------------------------------------------------------------
2024-07-29 09:13:28,245 EPOCH 9
2024-07-29 09:14:17,916 batch 104/1044 - loss 1.93107039 - lr 0.0010 - time 49.67s
2024-07-29 09:15:10,060 batch 208/1044 - loss 1.95099017 - lr 0.0010 - time 101.81s
2024-07-29 09:16:01,819 batch 312/1044 - loss 1.94943737 - lr 0.0010 - time 153.57s
2024-07-29 09:16:56,659 batch 416/1044 - loss 1.94723259 - lr 0.0010 - time 208.41s
2024-07-29 09:17:48,313 batch 520/1044 - loss 1.94754128 - lr 0.0010 - time 260.07s
2024-07-29 09:18:38,708 batch 624/1044 - loss 1.94901741 - lr 0.0010 - time 310.46s
2024-07-29 09:19:29,542 batch 728/1044 - loss 1.95013667 - lr 0.0010 - time 361.30s
2024-07-29 09:20:22,714 batch 832/1044 - loss 1.94866815 - lr 0.0010 - time 414.47s
2024-07-29 09:21:15,236 batch 936/1044 - loss 1.94871606 - lr 0.0010 - time 466.99s
2024-07-29 09:22:06,555 batch 1040/1044 - loss 1.94837562 - lr 0.0010 - time 518.31s
2024-07-29 09:22:08,570 ----------------------------------------------------------------------------------------------------
2024-07-29 09:22:08,572 EPOCH 9 DONE
2024-07-29 09:22:34,432 TRAIN Loss: 1.9484
2024-07-29 09:22:34,432 DEV Loss: 3.3895
2024-07-29 09:22:34,432 DEV Perplexity: 29.6497
2024-07-29 09:22:34,432 No improvement for 1 epoch(s)
2024-07-29 09:22:34,432 ----------------------------------------------------------------------------------------------------
2024-07-29 09:22:34,432 EPOCH 10
2024-07-29 09:23:25,550 batch 104/1044 - loss 1.93740847 - lr 0.0010 - time 51.12s
2024-07-29 09:24:14,975 batch 208/1044 - loss 1.94865602 - lr 0.0010 - time 100.54s
2024-07-29 09:25:08,386 batch 312/1044 - loss 1.93897269 - lr 0.0010 - time 153.95s
2024-07-29 09:26:01,085 batch 416/1044 - loss 1.93520124 - lr 0.0010 - time 206.65s
2024-07-29 09:26:53,620 batch 520/1044 - loss 1.93428783 - lr 0.0010 - time 259.19s
2024-07-29 09:27:46,957 batch 624/1044 - loss 1.93437176 - lr 0.0010 - time 312.52s
2024-07-29 09:28:39,693 batch 728/1044 - loss 1.93431406 - lr 0.0010 - time 365.26s
2024-07-29 09:29:31,536 batch 832/1044 - loss 1.93312064 - lr 0.0010 - time 417.10s
2024-07-29 09:30:23,577 batch 936/1044 - loss 1.93337018 - lr 0.0010 - time 469.14s
2024-07-29 09:31:15,446 batch 1040/1044 - loss 1.93256764 - lr 0.0010 - time 521.01s
2024-07-29 09:31:17,259 ----------------------------------------------------------------------------------------------------
2024-07-29 09:31:17,261 EPOCH 10 DONE
2024-07-29 09:31:43,257 TRAIN Loss: 1.9327
2024-07-29 09:31:43,257 DEV Loss: 3.4304
2024-07-29 09:31:43,257 DEV Perplexity: 30.8875
2024-07-29 09:31:43,257 No improvement for 2 epoch(s)
2024-07-29 09:31:43,257 ----------------------------------------------------------------------------------------------------
2024-07-29 09:31:43,257 Finished Training
2024-07-29 09:32:34,245 TEST Perplexity: 26.3855
2024-07-29 09:42:12,703 TEST BLEU = 3.28 38.5/5.9/1.0/0.5 (BP = 1.000 ratio = 1.000 hyp_len = 52 ref_len = 52)