---
language: en
license: mit
tags:
- spiking-neural-networks
- language-modeling
- neuromorphic
- energy-efficient
- biological-ai
datasets:
- PatrickHaller/fineweb-5B
pipeline_tag: text-generation
---
# 🧠 Spiking Neural Network Language Model - Training Checkpoint
**Live training checkpoint from the world's first large-scale spiking language model!**
## Current Training Status
- **Training Step**: 554,000
- **Tokens Processed**: 5.67B
- **Current Loss**: 4.5783
- **Spike Rate**: 0.0508
- **Learning Rate**: 8.15e-05
## Model Architecture
- **Parameters**: ~54M
- **Architecture**: 12-layer Spiking LTC Network
- **Hidden Size**: 768
- **Sequence Length**: 1024
- **Multi-timescale Processing**: Fast β†’ Medium β†’ Slow layers
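This card does not include the layer code itself, so the following is a minimal, hypothetical sketch of how a fast β†’ medium β†’ slow split across 12 layers might look using LTC-style (liquid time-constant) recurrent cells. The `LTCCell` class and the `tau` values are illustrative assumptions, not the released implementation.

```python
# Minimal sketch (not the released code): assigning fast/medium/slow timescales
# across a 12-layer stack of LTC-style recurrent cells.
import torch
import torch.nn as nn

class LTCCell(nn.Module):
    """Leaky state update h <- h + (1/tau) * (tanh(Wx + Uh) - h); tau sets the timescale."""
    def __init__(self, hidden_size: int, tau: float):
        super().__init__()
        self.tau = tau
        self.inp = nn.Linear(hidden_size, hidden_size)
        self.rec = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x, h):
        target = torch.tanh(self.inp(x) + self.rec(h))
        return h + (1.0 / self.tau) * (target - h)  # small tau = fast, large tau = slow

# 12 layers split into fast -> medium -> slow groups (illustrative time constants).
taus = [1.0] * 4 + [4.0] * 4 + [16.0] * 4
layers = nn.ModuleList([LTCCell(768, tau) for tau in taus])

x = torch.randn(8, 768)                       # one token's features for a batch of 8
states = [torch.zeros(8, 768) for _ in taus]  # per-layer hidden states
for i, layer in enumerate(layers):
    states[i] = layer(x, states[i])
    x = states[i]                             # feed each layer's state to the next
```

The intuition behind such a split is that layers with small time constants track fast, token-level detail while layers with large time constants retain slower-moving context.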
## Training Details
- **Dataset**: PatrickHaller/fineweb-5B
- **Target**: 3 epochs (~15B tokens total)
- **Biological Dynamics**: Adaptive thresholds and refractory periods (see the sketch after this list)
- **Energy Efficiency**: ~5% of neurons fire per step (spike rate β‰ˆ 0.05), versus dense activation in standard Transformers
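As a rough illustration of what adaptive thresholds, refractory periods, and the reported ~5% spike rate mean in code, here is a hedged sketch of a leaky integrate-and-fire style update. The decay, adaptation, and refractory constants are made-up values for demonstration, not the checkpoint's actual dynamics.

```python
# Toy sketch (assumed dynamics, not the released code): a LIF-style neuron population
# with an adaptive threshold and a refractory period, plus a per-step spike-rate measure.
import torch

def lif_step(v, thresh, refrac, x, decay=0.9, base_thresh=1.0,
             thresh_adapt=0.2, refrac_steps=2):
    """One timestep of leaky integrate-and-fire with adaptation and refractoriness."""
    active = (refrac == 0).float()            # refractory neurons cannot integrate
    v = decay * v + x * active                # leaky membrane integration
    spikes = (v >= thresh).float() * active   # fire where potential crosses threshold
    v = v * (1 - spikes)                      # reset fired neurons
    thresh = base_thresh + decay * (thresh - base_thresh) + thresh_adapt * spikes
    refrac = torch.clamp(refrac - 1, min=0) + spikes.long() * refrac_steps
    return v, thresh, refrac, spikes

# Run a toy population and measure the fraction of neurons firing per step.
v = torch.zeros(768); thresh = torch.ones(768); refrac = torch.zeros(768, dtype=torch.long)
rates = []
for _ in range(100):
    x = torch.rand(768) * 0.3                 # random input drive
    v, thresh, refrac, spikes = lif_step(v, thresh, refrac, x)
    rates.append(spikes.mean().item())
print(f"mean spike rate: {sum(rates)/len(rates):.4f}")  # sparse firing; not tuned to the ~5% figure
```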
## Scientific Significance
This represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating:
1. **Biological neural dynamics** can learn language at scale
2. **Energy efficiency** through sparse neural firing
3. **Multi-timescale processing** for hierarchical understanding
## Usage
```python
# Download this checkpoint file from the Hugging Face Hub
from huggingface_hub import hf_hub_download

checkpoint = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt",
)

# Load with the custom spiking model code
# (see the full implementation for complete usage)
```
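The downloaded file can be opened with `torch.load` to see what it contains before wiring it into the model class. The card does not document the checkpoint's internal key names, so treat this as a quick inspection sketch rather than the official loading path.

```python
# Continues from the snippet above; `checkpoint` is the local path returned by hf_hub_download.
import torch

state = torch.load(checkpoint, map_location="cpu")
print(type(state))
if isinstance(state, dict):
    for key in list(state)[:10]:   # peek at the top-level keys (names are checkpoint-specific)
        print(key)

# The tensors must then be loaded into the matching spiking model class
# from the full implementation before running inference.
```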
---
**πŸ”¬ This is live research in progress! Check back for updates as training continues.**
**Training Progress**: 37.8% complete towards 15B tokens