---
language: en
license: mit
tags:
- spiking-neural-networks
- language-modeling
- neuromorphic
- energy-efficient
- biological-ai
datasets:
- fineweb-5B
pipeline_tag: text-generation
---

# 🧠 Spiking Neural Network Language Model - Training Checkpoint

**Live training checkpoint from the world's first large-scale spiking language model!**

## Current Training Status

- **Training Step**: 554,000
- **Tokens Processed**: 5.67B
- **Current Loss**: 4.5783
- **Spike Rate**: 0.0508
- **Learning Rate**: 8.15e-05

## Model Architecture

- **Parameters**: ~54M
- **Architecture**: 12-layer Spiking LTC Network
- **Hidden Size**: 768
- **Sequence Length**: 1024
- **Multi-timescale Processing**: fast → medium → slow layers

## Training Details

- **Dataset**: PatrickHaller/fineweb-5B
- **Target**: 3 epochs (~15B tokens total)
- **Biological Dynamics**: adaptive thresholds, refractory periods
- **Energy Efficiency**: roughly 5% of neurons active per step, versus dense (≈100%) activation in standard Transformers

## Scientific Significance

This checkpoint represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating that:

1. **Biological neural dynamics** can learn language at scale
2. **Energy efficiency** follows from sparse neural firing
3. **Multi-timescale processing** supports hierarchical understanding

An illustrative (unofficial) sketch of these dynamics appears in the appendix at the end of this card.

## Usage

```python
# Download this checkpoint
from huggingface_hub import hf_hub_download

checkpoint = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt"
)

# Load with custom spiking model code
# (See full implementation for complete usage)
```

A hedged checkpoint-inspection sketch is included in the appendix at the end of this card.

---

**🔬 This is live research in progress! Check back for updates as training continues.**

**Training Progress**: 37.8% complete towards 15B tokens
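
## Appendix: Illustrative Sketches (Unofficial)

The card above describes adaptive thresholds, refractory periods, and fast → medium → slow timescales, but does not include the model code. The snippet below is a minimal sketch of how a spiking layer with those ingredients could look in PyTorch. It is **not** the checkpoint's actual architecture: the class name `SpikingLayer`, the time constants, and the threshold-adaptation rule are assumptions made for illustration, and no surrogate gradient (needed for actual training) is included.

```python
# Hypothetical sketch of a spiking layer with the biological dynamics this card
# describes (leaky membrane, adaptive threshold, refractory period).
# NOT the checkpoint's architecture; names and constants are assumptions.
import math

import torch
import torch.nn as nn


class SpikingLayer(nn.Module):
    def __init__(self, hidden_size=768, tau=10.0, theta0=1.0, refractory_steps=2):
        super().__init__()
        self.fc = nn.Linear(hidden_size, hidden_size)
        self.decay = math.exp(-1.0 / tau)         # membrane leak per time step
        self.theta0 = theta0                      # baseline firing threshold
        self.refractory_steps = refractory_steps  # silent steps after each spike

    def forward(self, x):
        # x: (batch, time, hidden); returns binary spikes of the same shape
        batch, time, hidden = x.shape
        v = x.new_zeros(batch, hidden)                    # membrane potential
        theta = x.new_full((batch, hidden), self.theta0)  # adaptive threshold
        refrac = x.new_zeros(batch, hidden)               # refractory countdown
        spikes = []
        for t in range(time):
            v = self.decay * v + self.fc(x[:, t])         # leaky integration of input current
            active = (refrac <= 0).float()                # neurons outside their refractory period
            s = (v >= theta).float() * active             # sparse binary spikes
            v = v - s * theta                             # soft reset of neurons that fired
            theta = 0.99 * theta + 0.01 * self.theta0 + 0.2 * s  # threshold adapts upward after spiking
            refrac = torch.clamp(refrac - 1, min=0) + s * self.refractory_steps
            spikes.append(s)
        return torch.stack(spikes, dim=1)


# Fast -> medium -> slow processing expressed as increasing membrane time
# constants per layer group (an assumption about what "multi-timescale" means).
layers = nn.ModuleList([SpikingLayer(tau=t) for t in (5.0, 20.0, 80.0)])
x = torch.randn(2, 16, 768)
for layer in layers:
    x = layer(x)
print("spike rate of last layer:", x.mean().item())  # sparse activity, cf. ~0.05 reported above
```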
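
The Usage section downloads `checkpoint_554000.pt` but leaves model construction to the full implementation. The sketch below shows one way to inspect that file with plain PyTorch, assuming it is an ordinary `torch.save` checkpoint; the actual keys, config, and spiking model class are defined by the authors' code and are not reproduced here.

```python
# Hedged sketch: inspect the downloaded checkpoint with plain PyTorch.
# Assumes checkpoint_554000.pt is a standard torch.save file; its actual
# contents and the spiking model class come from the authors' implementation.
import torch
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt",
)

# Depending on your PyTorch version, you may need weights_only=False if the
# file stores non-tensor objects (e.g. optimizer state or config dicts).
state = torch.load(path, map_location="cpu")

# List what the checkpoint contains without instantiating the model.
if isinstance(state, dict):
    for key, value in state.items():
        if torch.is_tensor(value):
            print(key, tuple(value.shape))
        else:
            print(key, type(value).__name__)
```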