Update README.md
README.md CHANGED

@@ -8,6 +8,9 @@ license: mit
 
 By Fernando, Eric and David
 
+[](https://discord.gg/cognitivecomputations)
+Discord: https://discord.gg/cognitivecomputations
+
 This is a hack around PyTorch + the Hugging Face Transformers library to make the original Dolphin Phi-2 behave in a way inspired by Meta's paper "MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases" [ https://arxiv.org/abs/2402.14905 ]
 
 One of the key ideas is that it works as "an online passthrough": a loop is applied on a module SuperClass that groups layers, in such a way that their forward method is repeated in a loop.
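The "online passthrough" idea described in the README can be sketched in plain Python (the actual hack wraps `torch.nn.Module` layers inside the Transformers model; the class name `LoopedGroup`, the `n_loops` parameter, and the toy layers below are illustrative assumptions, not the repository's real code):

```python
class LoopedGroup:
    """Sketch of the MobileLLM-inspired idea: a superclass-like wrapper
    that groups sub-layers and repeats their combined forward pass
    n_loops times, so the same weights are reused on every pass
    (immediate block-wise layer sharing)."""

    def __init__(self, layers, n_loops=2):
        self.layers = layers      # grouped sub-layers, applied in order
        self.n_loops = n_loops    # how many times the group is repeated

    def forward(self, x):
        # The outer loop re-applies the whole group: this is the
        # "passthrough" that makes shared layers act like extra depth.
        for _ in range(self.n_loops):
            for layer in self.layers:
                x = layer(x)
        return x

# Toy "layers": each adds 1. Two layers looped twice -> four applications.
group = LoopedGroup([lambda x: x + 1, lambda x: x + 1], n_loops=2)
print(group.forward(0))  # 4
```

In the PyTorch version, `LoopedGroup` would subclass `nn.Module`, hold the grouped decoder layers in an `nn.ModuleList`, and forward hidden states through them in the same double loop, which adds effective depth without adding parameters.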