kazandaev/mt5-base-en-ru

text2text-generation

Generated from Trainer

kazandaev committed on Mar 13, 2022

Commit c6d1fb7 · 1 parent: 150ea98

update model card README.md

Files changed (1)

README.md +12 -12

@@ -15,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8401
-- Bleu: 12.2867
-- Gen Len: 17.8712
 ## Model description
@@ -36,12 +36,12 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0004
-- train_batch_size: 14
-- eval_batch_size: 3
 - seed: 42
 - gradient_accumulation_steps: 10
-- total_train_batch_size: 140
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
@@ -50,11 +50,11 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
-| 1.1469        | 1.0   | 11019 | 1.0554          | 10.1262 | 17.8974 |
-| 1.0112        | 2.0   | 22038 | 0.9529          | 10.9674 | 17.8698 |
-| 0.937         | 3.0   | 33057 | 0.8913          | 11.6301 | 17.8687 |
-| 0.8809        | 4.0   | 44076 | 0.8545          | 11.9517 | 17.8833 |
-| 0.8501        | 5.0   | 55095 | 0.8401          | 12.2867 | 17.8712 |
 ### Framework versions

 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.7716
+- Bleu: 13.1062
+- Gen Len: 17.8687
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 16
+- eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 10
+- total_train_batch_size: 160
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
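
The `total_train_batch_size` entry follows from the other two batch settings: 16 × 10 = 160 after this commit, 14 × 10 = 140 before it. A minimal Python sketch of that arithmetic, assuming a single training device (the card does not state the device count). `effective_batch_size` is a hypothetical helper written for illustration, not part of any library. The per-epoch example counts are only rough estimates taken from the step counts logged at epoch 1.0.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values copied from the hyperparameter lists in this diff.
before = effective_batch_size(14, 10)  # removed side of the diff
after = effective_batch_size(16, 10)   # added side of the diff

# Rough examples per epoch: optimizer steps logged at epoch 1.0
# multiplied by the effective batch size (ignores partial final batches).
approx_examples_before = 11019 * before
approx_examples_after = 9641 * after

print(before, after)
print(approx_examples_before, approx_examples_after)
```

Both estimates come out near 1.54 million examples, which would be consistent with the two runs using a training set of the same size. That is an inference from the logged step counts, not something the card states.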
 | Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
+| 0.856         | 1.0   | 9641  | 0.8368          | 12.1924 | 17.8903 |
+| 0.8281        | 2.0   | 19282 | 0.8107          | 12.5703 | 17.8566 |
+| 0.8017        | 3.0   | 28923 | 0.7904          | 12.7893 | 17.8793 |
+| 0.7788        | 4.0   | 38564 | 0.7779          | 13.0086 | 17.8712 |
+| 0.7673        | 5.0   | 48205 | 0.7716          | 13.1062 | 17.8687 |
 ### Framework versions