update model card README.md
    	
README.md CHANGED
    
@@ -15,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->

 This model was trained from scratch on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Bleu:
-- Gen Len: 17.
+- Loss: 0.7716
+- Bleu: 13.1062
+- Gen Len: 17.8687

 ## Model description

@@ -36,12 +36,12 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate: 0.
-- train_batch_size:
-- eval_batch_size:
+- learning_rate: 0.0002
+- train_batch_size: 16
+- eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 10
-- total_train_batch_size:
+- total_train_batch_size: 160
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
@@ -50,11 +50,11 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
-|
-|
-| 0.
-| 0.
-| 0.
+| 0.856         | 1.0   | 9641  | 0.8368          | 12.1924 | 17.8903 |
+| 0.8281        | 2.0   | 19282 | 0.8107          | 12.5703 | 17.8566 |
+| 0.8017        | 3.0   | 28923 | 0.7904          | 12.7893 | 17.8793 |
+| 0.7788        | 4.0   | 38564 | 0.7779          | 13.0086 | 17.8712 |
+| 0.7673        | 5.0   | 48205 | 0.7716          | 13.1062 | 17.8687 |


 ### Framework versions
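
The updated hyperparameters and the step counts in the results table can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions the card does not state: a single training device (so the total batch size is the per-device batch size times the accumulation steps), no warmup, and a linear schedule that decays to zero at the final step. The helper names `effective_batch_size` and `linear_lr` are hypothetical and not taken from any library.

```python
# Values copied from the updated model card.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 10
STEPS_PER_EPOCH = 9641   # "Step" value at epoch 1.0 in the results table
NUM_EPOCHS = 5


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(f"total_train_batch_size: {total_batch}")
print(f"total optimizer steps:  {total_steps}")

# Learning rate at each epoch boundary under the assumed schedule.
for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    lr = linear_lr(step, total_steps, LEARNING_RATE)
    print(f"epoch {epoch}: step {step}, lr ~ {lr:.2e}")
```

Under these assumptions the computed batch size agrees with the card's `total_train_batch_size: 160`, and the total step count agrees with the final `Step` value of 48205 in the table. If the run used several devices or a warmup phase, the corresponding arguments would need to change.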