kazandaev committed on
Commit c6d1fb7 · 1 Parent(s): 150ea98

update model card README.md

Files changed (1)
  1. README.md +12 -12
README.md CHANGED
@@ -15,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model was trained from scratch on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.8401
- - Bleu: 12.2867
- - Gen Len: 17.8712
+ - Loss: 0.7716
+ - Bleu: 13.1062
+ - Gen Len: 17.8687
 
  ## Model description
 
@@ -36,12 +36,12 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.0004
- - train_batch_size: 14
- - eval_batch_size: 3
+ - learning_rate: 0.0002
+ - train_batch_size: 16
+ - eval_batch_size: 4
  - seed: 42
  - gradient_accumulation_steps: 10
- - total_train_batch_size: 140
+ - total_train_batch_size: 160
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - num_epochs: 5
@@ -50,11 +50,11 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
  |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
- | 1.1469 | 1.0 | 11019 | 1.0554 | 10.1262 | 17.8974 |
- | 1.0112 | 2.0 | 22038 | 0.9529 | 10.9674 | 17.8698 |
- | 0.937 | 3.0 | 33057 | 0.8913 | 11.6301 | 17.8687 |
- | 0.8809 | 4.0 | 44076 | 0.8545 | 11.9517 | 17.8833 |
- | 0.8501 | 5.0 | 55095 | 0.8401 | 12.2867 | 17.8712 |
+ | 0.856 | 1.0 | 9641 | 0.8368 | 12.1924 | 17.8903 |
+ | 0.8281 | 2.0 | 19282 | 0.8107 | 12.5703 | 17.8566 |
+ | 0.8017 | 3.0 | 28923 | 0.7904 | 12.7893 | 17.8793 |
+ | 0.7788 | 4.0 | 38564 | 0.7779 | 13.0086 | 17.8712 |
+ | 0.7673 | 5.0 | 48205 | 0.7716 | 13.1062 | 17.8687 |
 
 
  ### Framework versions