Commit 1339be4 · Parent(s): 475c47e

initial
Files changed:
- README.md +312 -0
- added_tokens.json +1 -0
- config.json +35 -0
- optimizer.pt +3 -0
- pytorch_model.bin +3 -0
- rng_state.pth +3 -0
- runs/Apr17_09-15-38_delilah/1650208078.0177557/events.out.tfevents.1650208078.delilah.1673820.1 +3 -0
- runs/Apr17_09-15-38_delilah/events.out.tfevents.1650208078.delilah.1673820.0 +3 -0
- runs/Apr20_13-31-29_delilah/1650479523.0943172/events.out.tfevents.1650479523.delilah.16090.1 +3 -0
- runs/Apr20_13-31-29_delilah/events.out.tfevents.1650479523.delilah.16090.0 +3 -0
- runs/Apr20_13-31-46_delilah/1650479523.0845559/events.out.tfevents.1650479523.delilah.16500.1 +3 -0
- runs/Apr20_13-31-46_delilah/events.out.tfevents.1650479523.delilah.16500.0 +3 -0
- runs/Apr20_13-58-10_delilah/1650481104.9475336/events.out.tfevents.1650481104.delilah.41141.1 +3 -0
- runs/Apr20_13-58-10_delilah/events.out.tfevents.1650481104.delilah.41141.0 +3 -0
- runs/May16_17-58-59_delilah/1652743313.9479656/events.out.tfevents.1652743313.delilah.3361827.1 +3 -0
- runs/May16_17-58-59_delilah/events.out.tfevents.1652743313.delilah.3361827.0 +3 -0
- runs/May16_17-58-59_delilah/events.out.tfevents.1654218281.delilah.3361827.2 +3 -0
- scheduler.pt +3 -0
- special_tokens_map.json +1 -0
- spm.model +3 -0
- tokenizer.json +0 -0
- tokenizer_config.json +1 -0
- trainer_state.json +0 -0
- training_args.bin +3 -0
    	
README.md ADDED
@@ -0,0 +1,312 @@
+---
+tags:
+- generated_from_trainer
+model-index:
+- name: deberta-v3-large-ddlm
+  results: []
+---
+
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# deberta-v3-large-ddlm
+
+This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5241
+
+## Model description
+
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
+
+### Training hyperparameters
+
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- gradient_accumulation_steps: 64
+- total_train_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 3.0
+
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 0.9823 | 0.01 | 1000 | 0.9163 |
+| 0.8817 | 0.02 | 2000 | 0.9022 |
+| 0.9647 | 0.03 | 3000 | 0.8879 |
+| 0.8646 | 0.04 | 4000 | 0.8577 |
+| 0.9159 | 0.06 | 5000 | 0.8677 |
+| 0.8449 | 0.07 | 6000 | 0.8221 |
+| 0.8681 | 0.08 | 7000 | 0.8332 |
+| 0.8738 | 0.09 | 8000 | 0.8334 |
+| 0.8638 | 0.1 | 9000 | 0.8236 |
+| 0.9066 | 0.11 | 10000 | 0.8200 |
+| 0.8686 | 0.12 | 11000 | 0.8092 |
+| 0.7736 | 0.13 | 12000 | 0.8199 |
+| 0.8054 | 0.14 | 13000 | 0.7972 |
+| 0.8934 | 0.16 | 14000 | 0.7998 |
+| 0.7884 | 0.17 | 15000 | 0.7895 |
+| 0.8278 | 0.18 | 16000 | 0.7586 |
+| 0.8482 | 0.19 | 17000 | 0.7562 |
+| 0.8716 | 0.2 | 18000 | 0.7819 |
+| 0.8881 | 0.21 | 19000 | 0.7878 |
+| 0.8397 | 0.22 | 20000 | 0.7989 |
+| 0.811 | 0.23 | 21000 | 0.7846 |
+| 0.7762 | 0.24 | 22000 | 0.7753 |
+| 0.7778 | 0.25 | 23000 | 0.7878 |
+| 0.737 | 0.27 | 24000 | 0.7473 |
+| 0.8451 | 0.28 | 25000 | 0.7460 |
+| 0.823 | 0.29 | 26000 | 0.7300 |
+| 0.7472 | 0.3 | 27000 | 0.7292 |
+| 0.8048 | 0.31 | 28000 | 0.7697 |
+| 0.7962 | 0.32 | 29000 | 0.7359 |
+| 0.8048 | 0.33 | 30000 | 0.7409 |
+| 0.8095 | 0.34 | 31000 | 0.7434 |
+| 0.7451 | 0.35 | 32000 | 0.7534 |
+| 0.6997 | 0.37 | 33000 | 0.7602 |
+| 0.8116 | 0.38 | 34000 | 0.7566 |
+| 0.7963 | 0.39 | 35000 | 0.7245 |
+| 0.786 | 0.4 | 36000 | 0.7311 |
+| 0.7991 | 0.41 | 37000 | 0.7230 |
+| 0.723 | 0.42 | 38000 | 0.7209 |
+| 0.789 | 0.43 | 39000 | 0.7418 |
+| 0.7296 | 0.44 | 40000 | 0.7325 |
+| 0.7363 | 0.45 | 41000 | 0.7134 |
+| 0.758 | 0.47 | 42000 | 0.6948 |
+| 0.711 | 0.48 | 43000 | 0.6992 |
+| 0.7984 | 0.49 | 44000 | 0.7055 |
+| 0.8402 | 0.5 | 45000 | 0.7108 |
+| 0.8553 | 0.51 | 46000 | 0.7005 |
+| 0.7538 | 0.52 | 47000 | 0.7208 |
+| 0.7169 | 0.53 | 48000 | 0.7291 |
+| 0.7345 | 0.54 | 49000 | 0.7195 |
+| 0.758 | 0.55 | 50000 | 0.6694 |
+| 0.7868 | 0.56 | 51000 | 0.6938 |
+| 0.6966 | 0.58 | 52000 | 0.6867 |
+| 0.7389 | 0.59 | 53000 | 0.6862 |
+| 0.7529 | 0.6 | 54000 | 0.7175 |
+| 0.7345 | 0.61 | 55000 | 0.6970 |
+| 0.766 | 0.62 | 56000 | 0.7017 |
+| 0.7043 | 0.63 | 57000 | 0.6916 |
+| 0.6474 | 0.64 | 58000 | 0.7129 |
+| 0.7456 | 0.65 | 59000 | 0.6802 |
+| 0.7512 | 0.66 | 60000 | 0.6951 |
+| 0.6816 | 0.68 | 61000 | 0.7072 |
+| 0.7206 | 0.69 | 62000 | 0.6967 |
+| 0.6439 | 0.7 | 63000 | 0.6798 |
+| 0.7309 | 0.71 | 64000 | 0.7163 |
+| 0.6925 | 0.72 | 65000 | 0.6794 |
+| 0.6833 | 0.73 | 66000 | 0.6637 |
+| 0.6643 | 0.74 | 67000 | 0.6855 |
+| 0.6433 | 0.75 | 68000 | 0.7035 |
+| 0.7595 | 0.76 | 69000 | 0.7008 |
+| 0.7214 | 0.78 | 70000 | 0.6618 |
+| 0.7111 | 0.79 | 71000 | 0.6850 |
+| 0.7375 | 0.8 | 72000 | 0.6909 |
+| 0.6779 | 0.81 | 73000 | 0.7042 |
+| 0.6646 | 0.82 | 74000 | 0.6634 |
+| 0.6616 | 0.83 | 75000 | 0.7020 |
+| 0.6762 | 0.84 | 76000 | 0.6638 |
+| 0.7509 | 0.85 | 77000 | 0.6541 |
+| 0.6963 | 0.86 | 78000 | 0.6781 |
+| 0.6949 | 0.87 | 79000 | 0.6576 |
+| 0.6781 | 0.89 | 80000 | 0.6900 |
+| 0.65 | 0.9 | 81000 | 0.6835 |
+| 0.7205 | 0.91 | 82000 | 0.6712 |
+| 0.6901 | 0.92 | 83000 | 0.6699 |
+| 0.6972 | 0.93 | 84000 | 0.6456 |
+| 0.7041 | 0.94 | 85000 | 0.6497 |
+| 0.6864 | 0.95 | 86000 | 0.6432 |
+| 0.7308 | 0.96 | 87000 | 0.6497 |
+| 0.6886 | 0.97 | 88000 | 0.6674 |
+| 0.6947 | 0.99 | 89000 | 0.6638 |
+| 0.6567 | 1.0 | 90000 | 0.6242 |
+| 0.7185 | 1.01 | 91000 | 0.6704 |
+| 0.7435 | 1.02 | 92000 | 0.6681 |
+| 0.7108 | 1.03 | 93000 | 0.6619 |
+| 0.6942 | 1.04 | 94000 | 0.6306 |
+| 0.6998 | 1.05 | 95000 | 0.6409 |
+| 0.6481 | 1.06 | 96000 | 0.6476 |
+| 0.727 | 1.07 | 97000 | 0.6354 |
+| 0.647 | 1.09 | 98000 | 0.6222 |
+| 0.6622 | 1.1 | 99000 | 0.6119 |
+| 0.6346 | 1.11 | 100000 | 0.6471 |
+| 0.6203 | 1.12 | 101000 | 0.6655 |
+| 0.6765 | 1.13 | 102000 | 0.6473 |
+| 0.6703 | 1.14 | 103000 | 0.6308 |
+| 0.6793 | 1.15 | 104000 | 0.6531 |
+| 0.683 | 1.16 | 105000 | 0.6693 |
+| 0.6654 | 1.17 | 106000 | 0.6241 |
+| 0.6626 | 1.18 | 107000 | 0.6215 |
+| 0.6976 | 1.2 | 108000 | 0.6479 |
+| 0.7494 | 1.21 | 109000 | 0.6345 |
+| 0.691 | 1.22 | 110000 | 0.6322 |
+| 0.6568 | 1.23 | 111000 | 0.6265 |
+| 0.705 | 1.24 | 112000 | 0.6281 |
+| 0.6307 | 1.25 | 113000 | 0.6202 |
+| 0.6828 | 1.26 | 114000 | 0.6158 |
+| 0.6403 | 1.27 | 115000 | 0.6495 |
+| 0.6615 | 1.28 | 116000 | 0.6298 |
+| 0.6237 | 1.3 | 117000 | 0.6234 |
+| 0.6707 | 1.31 | 118000 | 0.6267 |
+| 0.6823 | 1.32 | 119000 | 0.6299 |
+| 0.6333 | 1.33 | 120000 | 0.6169 |
+| 0.685 | 1.34 | 121000 | 0.6371 |
+| 0.6941 | 1.35 | 122000 | 0.6245 |
+| 0.6358 | 1.36 | 123000 | 0.6291 |
+| 0.6754 | 1.37 | 124000 | 0.6400 |
+| 0.6286 | 1.38 | 125000 | 0.6148 |
+| 0.7036 | 1.4 | 126000 | 0.6033 |
+| 0.645 | 1.41 | 127000 | 0.6295 |
+| 0.6578 | 1.42 | 128000 | 0.6348 |
+| 0.651 | 1.43 | 129000 | 0.6222 |
+| 0.5558 | 1.44 | 130000 | 0.6231 |
+| 0.6601 | 1.45 | 131000 | 0.6302 |
+| 0.6304 | 1.46 | 132000 | 0.6127 |
+| 0.6177 | 1.47 | 133000 | 0.6047 |
+| 0.5933 | 1.48 | 134000 | 0.6169 |
+| 0.6307 | 1.49 | 135000 | 0.6012 |
+| 0.6018 | 1.51 | 136000 | 0.5900 |
+| 0.6724 | 1.52 | 137000 | 0.6086 |
+| 0.6367 | 1.53 | 138000 | 0.6414 |
+| 0.6515 | 1.54 | 139000 | 0.6267 |
+| 0.5902 | 1.55 | 140000 | 0.5913 |
+| 0.6523 | 1.56 | 141000 | 0.5992 |
+| 0.6005 | 1.57 | 142000 | 0.6128 |
+| 0.6179 | 1.58 | 143000 | 0.6089 |
+| 0.6154 | 1.59 | 144000 | 0.6353 |
+| 0.6298 | 1.61 | 145000 | 0.5997 |
+| 0.5623 | 1.62 | 146000 | 0.5974 |
+| 0.5787 | 1.63 | 147000 | 0.6165 |
+| 0.6099 | 1.64 | 148000 | 0.6246 |
+| 0.658 | 1.65 | 149000 | 0.6116 |
+| 0.6567 | 1.66 | 150000 | 0.5938 |
+| 0.6227 | 1.67 | 151000 | 0.5948 |
+| 0.5858 | 1.68 | 152000 | 0.5822 |
+| 0.6227 | 1.69 | 153000 | 0.5802 |
+| 0.6699 | 1.71 | 154000 | 0.6067 |
+| 0.5989 | 1.72 | 155000 | 0.6073 |
+| 0.6184 | 1.73 | 156000 | 0.6124 |
+| 0.6404 | 1.74 | 157000 | 0.6169 |
+| 0.639 | 1.75 | 158000 | 0.5997 |
+| 0.6433 | 1.76 | 159000 | 0.5989 |
+| 0.5574 | 1.77 | 160000 | 0.5796 |
+| 0.5983 | 1.78 | 161000 | 0.6036 |
+| 0.6532 | 1.79 | 162000 | 0.5888 |
+| 0.6679 | 1.8 | 163000 | 0.6038 |
+| 0.62 | 1.82 | 164000 | 0.5984 |
+| 0.5541 | 1.83 | 165000 | 0.6003 |
+| 0.6192 | 1.84 | 166000 | 0.5786 |
+| 0.6613 | 1.85 | 167000 | 0.6064 |
+| 0.5923 | 1.86 | 168000 | 0.6018 |
+| 0.5894 | 1.87 | 169000 | 0.5912 |
+| 0.6462 | 1.88 | 170000 | 0.5902 |
+| 0.5811 | 1.89 | 171000 | 0.6030 |
+| 0.6358 | 1.9 | 172000 | 0.5915 |
+| 0.614 | 1.92 | 173000 | 0.5886 |
+| 0.5969 | 1.93 | 174000 | 0.6084 |
+| 0.6146 | 1.94 | 175000 | 0.6003 |
+| 0.6051 | 1.95 | 176000 | 0.5835 |
+| 0.6268 | 1.96 | 177000 | 0.5999 |
+| 0.6436 | 1.97 | 178000 | 0.5965 |
+| 0.6167 | 1.98 | 179000 | 0.5789 |
+| 0.5647 | 1.99 | 180000 | 0.5669 |
+| 0.6038 | 2.0 | 181000 | 0.6009 |
+| 0.6082 | 2.02 | 182000 | 0.5799 |
+| 0.6483 | 2.03 | 183000 | 0.5716 |
+| 0.5503 | 2.04 | 184000 | 0.5806 |
+| 0.6231 | 2.05 | 185000 | 0.5699 |
+| 0.5892 | 2.06 | 186000 | 0.5979 |
+| 0.5933 | 2.07 | 187000 | 0.5709 |
+| 0.594 | 2.08 | 188000 | 0.5719 |
+| 0.5838 | 2.09 | 189000 | 0.5879 |
+| 0.6039 | 2.1 | 190000 | 0.5984 |
+| 0.5911 | 2.11 | 191000 | 0.5953 |
+| 0.563 | 2.13 | 192000 | 0.5772 |
+| 0.5671 | 2.14 | 193000 | 0.5771 |
+| 0.6051 | 2.15 | 194000 | 0.5972 |
+| 0.5852 | 2.16 | 195000 | 0.5917 |
+| 0.5757 | 2.17 | 196000 | 0.5819 |
+| 0.6557 | 2.18 | 197000 | 0.5655 |
+| 0.6055 | 2.19 | 198000 | 0.5820 |
+| 0.6067 | 2.2 | 199000 | 0.5801 |
+| 0.6422 | 2.21 | 200000 | 0.5590 |
+| 0.624 | 2.23 | 201000 | 0.5573 |
+| 0.6222 | 2.24 | 202000 | 0.5661 |
+| 0.5597 | 2.25 | 203000 | 0.5786 |
+| 0.5746 | 2.26 | 204000 | 0.5622 |
+| 0.6269 | 2.27 | 205000 | 0.5804 |
+| 0.6241 | 2.28 | 206000 | 0.5696 |
+| 0.6519 | 2.29 | 207000 | 0.5367 |
+| 0.6161 | 2.3 | 208000 | 0.5666 |
+| 0.5415 | 2.31 | 209000 | 0.5633 |
+| 0.633 | 2.33 | 210000 | 0.5499 |
+| 0.5566 | 2.34 | 211000 | 0.5822 |
+| 0.6158 | 2.35 | 212000 | 0.5826 |
+| 0.5574 | 2.36 | 213000 | 0.5429 |
+| 0.5748 | 2.37 | 214000 | 0.5736 |
+| 0.5818 | 2.38 | 215000 | 0.5599 |
+| 0.6226 | 2.39 | 216000 | 0.5407 |
+| 0.5733 | 2.4 | 217000 | 0.5759 |
+| 0.6268 | 2.41 | 218000 | 0.5725 |
+| 0.5885 | 2.42 | 219000 | 0.5771 |
+| 0.5708 | 2.44 | 220000 | 0.5654 |
+| 0.5783 | 2.45 | 221000 | 0.5756 |
+| 0.61 | 2.46 | 222000 | 0.5647 |
+| 0.5848 | 2.47 | 223000 | 0.5532 |
+| 0.5869 | 2.48 | 224000 | 0.5519 |
+| 0.5717 | 2.49 | 225000 | 0.5621 |
+| 0.5675 | 2.5 | 226000 | 0.5446 |
+| 0.6321 | 2.51 | 227000 | 0.5812 |
+| 0.568 | 2.52 | 228000 | 0.5673 |
+| 0.5577 | 2.54 | 229000 | 0.5590 |
+| 0.5888 | 2.55 | 230000 | 0.5628 |
+| 0.6389 | 2.56 | 231000 | 0.5828 |
+| 0.5782 | 2.57 | 232000 | 0.5543 |
+| 0.5871 | 2.58 | 233000 | 0.5575 |
+| 0.5593 | 2.59 | 234000 | 0.5625 |
+| 0.6167 | 2.6 | 235000 | 0.5450 |
+| 0.5828 | 2.61 | 236000 | 0.5627 |
+| 0.5411 | 2.62 | 237000 | 0.5498 |
+| 0.6168 | 2.64 | 238000 | 0.5891 |
+| 0.6508 | 2.65 | 239000 | 0.5811 |
+| 0.6322 | 2.66 | 240000 | 0.5649 |
+| 0.6131 | 2.67 | 241000 | 0.5473 |
+| 0.5419 | 2.68 | 242000 | 0.5583 |
+| 0.5685 | 2.69 | 243000 | 0.5635 |
+| 0.5267 | 2.7 | 244000 | 0.5481 |
+| 0.5357 | 2.71 | 245000 | 0.5474 |
+| 0.585 | 2.72 | 246000 | 0.5281 |
+| 0.5894 | 2.73 | 247000 | 0.5457 |
+| 0.5665 | 2.75 | 248000 | 0.5579 |
+| 0.5409 | 2.76 | 249000 | 0.5412 |
+| 0.6051 | 2.77 | 250000 | 0.5447 |
+| 0.5866 | 2.78 | 251000 | 0.5535 |
+| 0.5348 | 2.79 | 252000 | 0.5377 |
+| 0.5606 | 2.8 | 253000 | 0.5524 |
+| 0.5142 | 2.81 | 254000 | 0.5441 |
+| 0.543 | 2.82 | 255000 | 0.5499 |
+| 0.5763 | 2.83 | 256000 | 0.5241 |
+
+
+### Framework versions
+
+- Transformers 4.20.0.dev0
+- Pytorch 1.10.0+cu102
+- Datasets 1.15.1
+- Tokenizers 0.11.0
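The hyperparameter list in the model card maps one-to-one onto `TrainingArguments` from `transformers`. Note that the reported total_train_batch_size of 128 is the effective batch size: 2 per-device samples x 64 gradient-accumulation steps on a single device. A minimal sketch, with a hypothetical output_dir:

```python
# Minimal sketch of TrainingArguments matching the hyperparameters listed in
# the model card above (output_dir is hypothetical).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="models/deberta-v3-large-ddlm",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=64,   # 2 x 64 = effective batch size of 128
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the optimizer
# defaults the Trainer uses, so no extra configuration is needed for it.
```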
    	
added_tokens.json ADDED
@@ -0,0 +1 @@
+{"[MASK]": 128000}
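added_tokens.json registers `[MASK]` as one extra token with id 128000, one past the 128,000-piece SentencePiece vocabulary in spm.model (hence `vocab_size: 128001` in config.json below). A quick check, assuming this commit is checked out to a hypothetical local directory:

```python
# Minimal sketch: verify the extra [MASK] token id (local path is hypothetical).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./deberta-v3-large-ddlm")

# added_tokens.json maps "[MASK]" to id 128000.
assert tok.convert_tokens_to_ids("[MASK]") == 128000
print(tok.mask_token, tok.mask_token_id)  # [MASK] 128000
```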
    	
config.json ADDED
@@ -0,0 +1,35 @@
+{
+  "_name_or_path": "models/deberta-v3-large-ddlm/checkpoint-166000",
+  "architectures": [
+    "DebertaV2ForMaskedLM"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1024,
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "layer_norm_eps": 1e-07,
+  "max_position_embeddings": 512,
+  "max_relative_positions": -1,
+  "model_type": "deberta-v2",
+  "norm_rel_ebd": "layer_norm",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 0,
+  "pooler_dropout": 0,
+  "pooler_hidden_act": "gelu",
+  "pooler_hidden_size": 1024,
+  "pos_att_type": [
+    "p2c",
+    "c2p"
+  ],
+  "position_biased_input": false,
+  "position_buckets": 256,
+  "relative_attention": true,
+  "share_att_key": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.20.0.dev0",
+  "type_vocab_size": 0,
+  "vocab_size": 128001
+}
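config.json describes a standard DeBERTa-v2 large masked-LM: 24 layers, 16 heads, hidden size 1024, disentangled (p2c/c2p) relative attention with 256 position buckets. A minimal loading sketch, again with a hypothetical local path:

```python
# Minimal sketch: load the architecture described by config.json and inspect it.
from transformers import AutoConfig, AutoModelForMaskedLM

cfg = AutoConfig.from_pretrained("./deberta-v3-large-ddlm")
print(cfg.model_type, cfg.num_hidden_layers, cfg.hidden_size)  # deberta-v2 24 1024

# "architectures": ["DebertaV2ForMaskedLM"], so the masked-LM auto class applies.
model = AutoModelForMaskedLM.from_pretrained("./deberta-v3-large-ddlm")
print(sum(p.numel() for p in model.parameters()))  # roughly 435M parameters
```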
    	
optimizer.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cb5c9545a39c622eb7770b22f2f30dd4e7746a7b8830ff6c26f4c11c1c127ff5
+size 3480954133
    	
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:795d781ad75b4e0cdf0460f9855fe1b6051bf1f9f26a7dbce559a1903046947e
+size 1740500457
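pytorch_model.bin and the other binaries here are stored as Git LFS pointer stubs: three `key value` lines giving the spec version, the SHA-256 of the real payload, and its size in bytes (1,740,500,457 is about 1.7 GB, consistent with roughly 435M float32 parameters). A sketch of verifying a fetched blob against its pointer; keeping the stub next to the blob is a contrived setup for illustration, since `git lfs pull` normally replaces one with the other:

```python
# Minimal sketch: check a downloaded LFS object against its pointer stub.
# File names are hypothetical; adjust paths to wherever the stub and blob live.
import hashlib

def verify_lfs(pointer_path: str, blob_path: str) -> bool:
    # Parse the three "key value" lines of the pointer stub.
    fields = dict(line.split(" ", 1) for line in open(pointer_path))
    expected_oid = fields["oid"].strip().split(":", 1)[1]  # drop "sha256:" prefix
    expected_size = int(fields["size"])

    # Stream the blob through SHA-256 in 1 MiB chunks.
    h = hashlib.sha256()
    size = 0
    with open(blob_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            size += len(chunk)
    return h.hexdigest() == expected_oid and size == expected_size

print(verify_lfs("pytorch_model.bin.pointer", "pytorch_model.bin"))
```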
    	
rng_state.pth ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:15ec683b1b8f565d2c86daaf7d598a122c80379188098c3bc86654bd7fc07f39
+size 14503
    	
runs/Apr17_09-15-38_delilah/1650208078.0177557/events.out.tfevents.1650208078.delilah.1673820.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5acf8cd13491f0ed20162bf4404758f352d613dfcfe4447dc72820cc006a01ed
+size 4791
    	
runs/Apr17_09-15-38_delilah/events.out.tfevents.1650208078.delilah.1673820.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:07ee7c191ce05ae72a7d6ae830e0a6bd3860c1cb0a020bff24fd3d6327654401
+size 601125
    	
runs/Apr20_13-31-29_delilah/1650479523.0943172/events.out.tfevents.1650479523.delilah.16090.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f791601703ce491f7685f39c3dc5564adfecfc020579d9250e26999d3dc23ed2
+size 4791
    	
runs/Apr20_13-31-29_delilah/events.out.tfevents.1650479523.delilah.16090.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75b12fa88477bad43161335e4edb4ad5ac41f0a7d4e57a663f61c3e7146f7d1b
+size 4220
    	
runs/Apr20_13-31-46_delilah/1650479523.0845559/events.out.tfevents.1650479523.delilah.16500.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:248aeecb400818e702626b340712c44651293826cb947fb2e8a75fbdbbaf1bc5
+size 4791
    	
runs/Apr20_13-31-46_delilah/events.out.tfevents.1650479523.delilah.16500.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9d98b066d935f812dc8d6298e4b1b508a25e869643054b7f0c17ceb65160eb9b
+size 3900
    	
runs/Apr20_13-58-10_delilah/1650481104.9475336/events.out.tfevents.1650481104.delilah.41141.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4cb93a141008934a8adffc73e6fd45b81083950055cd934960db19b34de70f97
+size 4791
    	
runs/Apr20_13-58-10_delilah/events.out.tfevents.1650481104.delilah.41141.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4583a3dd51d5845a23af3e68760051d741bcd9cd14766bc9128f72b3a163d6dc
+size 5193980
    	
runs/May16_17-58-59_delilah/1652743313.9479656/events.out.tfevents.1652743313.delilah.3361827.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea40c86b02446fcdec0d255886e72e0efeed49df30883f894edc174057adabad
+size 5233
    	
runs/May16_17-58-59_delilah/events.out.tfevents.1652743313.delilah.3361827.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34793cf7505bdec738691a718eca33cbb3efae994af5b598c0db9304ae2651a5
+size 3390819
    	
runs/May16_17-58-59_delilah/events.out.tfevents.1654218281.delilah.3361827.2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b546a06c5bac92cef71cf7d0ca65735b321d9535ffa1f80cf81764762b36acf4
+size 316
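The runs/ directory holds TensorBoard event files from several training sessions on the host delilah; the files under the timestamped subdirectories record the hyperparameter dump, while the .0/.2 files carry the scalar logs. A sketch of reading scalars back out; the tag names below are the usual Trainer defaults, assumed rather than verified against these particular logs:

```python
# Minimal sketch: read logged scalars out of one run with TensorBoard's
# event_accumulator. Tag names ("train/loss", "eval/loss") are assumptions.
from tensorboard.backend.event_processing import event_accumulator

ea = event_accumulator.EventAccumulator("runs/May16_17-58-59_delilah")
ea.Reload()
print(ea.Tags()["scalars"])             # available scalar tags
for ev in ea.Scalars("eval/loss")[:5]:  # first few logged evaluation losses
    print(ev.step, ev.value)
```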
    	
scheduler.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:afc63963ca5afabe2e74f09c32eb2f23b8b37b6a3ff51e49db8094b3218c95bb
+size 623
    	
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+{"bos_token": "[CLS]", "eos_token": "[SEP]", "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
    	
spm.model ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
+size 2464616
    	
tokenizer.json ADDED
(The diff for this file is too large to render; see the raw diff.)
    	
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+{"do_lower_case": false, "bos_token": "[CLS]", "eos_token": "[SEP]", "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "split_by_punct": false, "sp_model_kwargs": {}, "vocab_type": "spm", "special_tokens_map_file": null, "name_or_path": "models/deberta-v3-large-ddlm/checkpoint-166000", "tokenizer_class": "DebertaV2Tokenizer"}
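tokenizer_config.json pins the tokenizer class to DebertaV2Tokenizer with case-sensitive SentencePiece tokenization (do_lower_case false, vocab_type spm). With the tokenizer files and weights in place, the checkpoint can be exercised end to end as a fill-mask model; a minimal sketch with a hypothetical local path:

```python
# Minimal end-to-end sketch: run the checkpoint as a fill-mask model.
# The local path is hypothetical; substitute wherever this commit is checked out.
from transformers import pipeline

fill = pipeline("fill-mask", model="./deberta-v3-large-ddlm")
for pred in fill("The capital of France is [MASK]."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```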
    	
trainer_state.json ADDED
(The diff for this file is too large to render; see the raw diff.)
    	
training_args.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b9874199dc5f46d210c0e39f0b3dda226896ad2a17cfa13b24e5fa6683a92be
+size 3247
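training_args.bin is the pickled TrainingArguments object the Trainer saves alongside each checkpoint, so the hyperparameters reported in the README can be recovered from it directly. A sketch, noting that unpickling needs a transformers version compatible with 4.20.0.dev0:

```python
# Minimal sketch: recover the saved TrainingArguments (a pickled object).
import torch

args = torch.load("training_args.bin")
print(args.learning_rate)                # 5e-05
print(args.per_device_train_batch_size)  # 2
print(args.gradient_accumulation_steps)  # 64
print(args.num_train_epochs)             # 3.0
```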