Training checkpoint 24000 - 0.2B tokens
- README.md +7 -7
- checkpoint_24000.pt +3 -0
- config.json +7 -7
    	
README.md CHANGED

```diff
@@ -18,11 +18,11 @@ pipeline_tag: text-generation
 
 ## Current Training Status
 
-- **Training Step**: …
-- **Tokens Processed**: 0.…
-- **Current Loss**: 4.…
-- **Spike Rate**: 0.…
-- **Learning Rate**: 1.…
+- **Training Step**: 24,000
+- **Tokens Processed**: 0.25B tokens
+- **Current Loss**: 4.7107
+- **Spike Rate**: 0.0134
+- **Learning Rate**: 1.85e-04
 
 ## Model Architecture
 
@@ -54,7 +54,7 @@ This represents ongoing training of the first large-scale spiking neural network
 from huggingface_hub import hf_hub_download
 
 checkpoint = hf_hub_download(
     repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
-    filename="…"
+    filename="checkpoint_24000.pt"
 )
 
 # Load with custom spiking model code
@@ -65,4 +65,4 @@ checkpoint = hf_hub_download(
 
 **🔬 This is live research in progress! Check back for updates as training continues.**
 
-**Training Progress**: 1.…
+**Training Progress**: 1.6% complete towards 15B tokens
```
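The status lines added to the README are rounded presentations of the raw metrics committed to config.json in the same commit. A quick sketch of that correspondence (plain Python, values copied from this commit):

```python
# Raw metrics from config.json in this commit; the README shows
# rounded versions of the same numbers.
raw = {
    "loss": 4.710656542452263,
    "spike_rate": 0.013364357058069197,
    "learning_rate": 0.00018493535648803967,
    "tokens_processed": 245_760_000,
}

print(f"**Current Loss**: {raw['loss']:.4f}")            # 4.7107
print(f"**Spike Rate**: {raw['spike_rate']:.4f}")        # 0.0134
print(f"**Learning Rate**: {raw['learning_rate']:.2e}")  # 1.85e-04
print(f"**Tokens Processed**: {raw['tokens_processed'] / 1e9:.2f}B tokens")  # 0.25B tokens
```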
    	
checkpoint_24000.pt ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:77d7b10c161cdce049b5ff5d11e24a9f2d197fb81da98791f6a86f7937b30a01
+size 999026492
```
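The new checkpoint is stored via Git LFS: the repo itself only tracks the three-line pointer above, while the ~1 GB payload lives in LFS storage and is fetched on download. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is illustrative, not part of the repo):

```python
# Parse a Git LFS pointer file like the one added in this commit.
# Each line is "key value"; "oid" carries "<hash-algo>:<hex-digest>".
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:77d7b10c161cdce049b5ff5d11e24a9f2d197fb81da98791f6a86f7937b30a01
size 999026492
"""

def parse_lfs_pointer(text: str) -> dict:
    fields = dict(line.split(" ", 1) for line in text.splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

info = parse_lfs_pointer(POINTER)
print(info["size_bytes"] / 1e9)  # ~0.999 GB checkpoint
```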
    	
config.json CHANGED

```diff
@@ -4,11 +4,11 @@
   "hidden_size": 768,
   "num_layers": 12,
   "max_seq_length": 1024,
-  "training_step": …,
-  "tokens_processed": …,
-  "loss": 4.…,
-  "spike_rate": 0.…,
-  "learning_rate": 0.…,
-  "epoch": 0.…,
-  "progress_percent": 1.…
+  "training_step": 24000,
+  "tokens_processed": 245760000,
+  "loss": 4.710656542452263,
+  "spike_rate": 0.013364357058069197,
+  "learning_rate": 0.00018493535648803967,
+  "epoch": 0.049152,
+  "progress_percent": 1.6384041943147374
 }
```
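The metrics in config.json are mutually consistent: 245,760,000 tokens over 24,000 steps works out to 10,240 tokens per step, and an epoch of 0.049152 implies roughly a 5B-token epoch, i.e. ~15B tokens over the 3 epochs that the README's progress figure targets. The 10 × 1,024 batch split is an inference from these numbers, not something the repo states:

```python
# Cross-check the metrics committed in config.json.
training_step = 24_000
tokens_processed = 245_760_000
max_seq_length = 1_024
epoch = 0.049152
progress_percent = 1.6384041943147374

tokens_per_step = tokens_processed // training_step  # 10240
# 10,240 tokens/step = 10 sequences of 1,024 tokens (inferred, not stated).
assert tokens_per_step == 10 * max_seq_length

# epoch = tokens_processed / tokens_per_epoch  =>  ~5B tokens per epoch.
tokens_per_epoch = tokens_processed / epoch  # ~5e9

# 3 epochs * ~5B = ~15B-token target; close to the stored progress_percent
# and to the README's "1.6% complete towards 15B tokens".
assert abs(100 * tokens_processed / (3 * tokens_per_epoch) - progress_percent) < 1e-3
```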