Update README.md
    	
README.md CHANGED

@@ -2,4 +2,6 @@
 
 This model has a word2vec token embedding matrix with 256k entries. The word2vec was trained on 100GB data from C4, MSMARCO, News, Wikipedia, S2ORC, for 3 epochs.
 
-Then the model was trained on this dataset with MLM for 
+Then the model was trained on this dataset with MLM for 1.37M steps (batch size 64). The token embeddings were NOT updated.
+
+For the initial word2vec weights with Gensim see: [https://huggingface.co/vocab-transformers/distilbert-word2vec_256k-MLM_1M/tree/main/word2vec](https://huggingface.co/vocab-transformers/distilbert-word2vec_256k-MLM_1M/tree/main/word2vec)
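
For a rough sense of scale, the numbers in the updated README can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the step count and batch size come from the README, while reading "256k" as exactly 256,000 entries and using an embedding width of 768 are assumptions that the README does not state. It also assumes no gradient accumulation.

```python
# Back-of-the-envelope numbers for the training setup described in the README.

# Stated in the README:
MLM_STEPS = 1_370_000   # "1.37M steps"
BATCH_SIZE = 64         # "batch size 64"

# ASSUMPTIONS (not stated in the README, for illustration only):
VOCAB_SIZE = 256_000    # "256k entries" read as exactly 256,000
HIDDEN_SIZE = 768       # hypothetical embedding width


def sequences_seen(steps: int, batch_size: int) -> int:
    """Total sequences processed during MLM, assuming no gradient accumulation."""
    return steps * batch_size


def embedding_parameters(vocab_size: int, hidden_size: int) -> int:
    """Number of weights in a vocab_size x hidden_size token embedding matrix."""
    return vocab_size * hidden_size


total_sequences = sequences_seen(MLM_STEPS, BATCH_SIZE)
frozen_params = embedding_parameters(VOCAB_SIZE, HIDDEN_SIZE)

print(f"Sequences seen during MLM: {total_sequences:,}")
print(f"Frozen embedding weights (under the assumptions above): {frozen_params:,}")
```

Under these assumptions, the token embedding matrix would account for a large block of weights that stayed fixed during MLM, since the README notes that the token embeddings were NOT updated.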