---
language: en
license: mit
tags:
  - spiking-neural-networks
  - language-modeling
  - neuromorphic
  - energy-efficient
  - biological-ai
datasets:
  - PatrickHaller/fineweb-5B
pipeline_tag: text-generation
---

🧠 Spiking Neural Network Language Model - Training Checkpoint

Live training checkpoint from the world's first large-scale spiking language model!

Current Training Status

  • Training Step: 554,000
  • Tokens Processed: 5.67B tokens
  • Current Loss: 4.5783
  • Spike Rate: 0.0508 (average fraction of neurons firing per step; see the measurement sketch after this list)
  • Learning Rate: 8.15e-05
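
The spike rate above is the average fraction of neurons that fire per timestep. Below is a minimal sketch of how such a number can be measured from a binary spike tensor; the tensor shape and helper name are illustrative, not taken from the training code.

```python
import torch

def spike_rate(spikes: torch.Tensor) -> float:
    """Average fraction of active neurons.

    `spikes` is assumed to be a binary tensor of shape
    (batch, seq_len, hidden) with 1.0 wherever a neuron fired.
    """
    return spikes.float().mean().item()

# Toy tensor with ~5% of neurons firing, matching the reported rate
spikes = (torch.rand(8, 1024, 768) < 0.05).float()
print(f"spike rate: {spike_rate(spikes):.4f}")  # ~0.05
```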

Model Architecture

  • Parameters: ~54M
  • Architecture: 12-layer Spiking LTC Network
  • Hidden Size: 768
  • Sequence Length: 1024
  • Multi-timescale Processing: Fast → Medium → Slow layers (see the configuration sketch after this list)
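
To make the figures above concrete, here is a minimal configuration sketch. The field names and the even split of the 12 layers into fast/medium/slow timescale groups are assumptions for illustration; the actual model code may organize these differently.

```python
from dataclasses import dataclass

@dataclass
class SpikingLTCConfig:
    # Figures stated in this model card
    n_layers: int = 12      # 12-layer Spiking LTC network
    hidden_size: int = 768
    seq_len: int = 1024

    # Assumed even split into timescale groups (fast -> medium -> slow)
    timescale_groups: tuple = ("fast",) * 4 + ("medium",) * 4 + ("slow",) * 4

config = SpikingLTCConfig()
print(config.timescale_groups)
```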

Training Details

  • Dataset: PatrickHaller/fineweb-5B
  • Target: 3 epochs (~15B tokens total)
  • Biological Dynamics: adaptive thresholds and refractory periods (a minimal sketch follows this list)
  • Energy Efficiency: roughly 5% of neurons active per step, versus dense activation in standard Transformers
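
The sketch below illustrates the dynamics named above: a leaky integrate-and-fire style update with an adaptive threshold and a refractory period, which together produce the sparse firing that the energy-efficiency claim rests on. The time constants, threshold parameters, and variable names are illustrative assumptions, not the model's actual values.

```python
import torch

def lif_step(v, thresh, refrac, x, tau=0.9, baseline=1.0,
             thresh_decay=0.95, thresh_bump=0.5, refrac_steps=2):
    """One spiking-neuron update with an adaptive threshold and a refractory period.

    v       membrane potentials   (batch, hidden)
    thresh  per-neuron thresholds (batch, hidden)
    refrac  refractory counters   (batch, hidden); steps left before a neuron may fire again
    x       input currents        (batch, hidden)
    """
    # Leaky integration of the input current
    v = tau * v + x

    # A neuron fires only if it crosses its threshold and is not refractory
    can_fire = (refrac <= 0).float()
    spikes = (v >= thresh).float() * can_fire

    # Reset fired neurons and start their refractory period
    v = v * (1.0 - spikes)
    refrac = torch.where(spikes.bool(),
                         torch.full_like(refrac, float(refrac_steps)),
                         refrac - 1)

    # Adaptive threshold: jumps after a spike, then decays back toward the baseline
    thresh = baseline + thresh_decay * (thresh - baseline) + thresh_bump * spikes

    return v, thresh, refrac, spikes

# Toy rollout: sparse firing emerges from thresholding plus refractoriness
v = torch.zeros(4, 768)
thresh = torch.ones(4, 768)
refrac = torch.zeros(4, 768)
for _ in range(20):
    v, thresh, refrac, spikes = lif_step(v, thresh, refrac, x=0.3 * torch.randn(4, 768))
print(f"final-step spike rate: {spikes.mean().item():.3f}")
```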

Scientific Significance

This represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating:

  1. Biological neural dynamics can learn language at scale
  2. Sparse neural firing delivers energy efficiency
  3. Multi-timescale processing supports hierarchical understanding

Usage

```python
# Download this checkpoint from the Hub
from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt",
)

# Running the model requires the custom spiking model code
# (see the full implementation for complete usage).
# As a quick sanity check, list the top-level contents of the file.
state = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
print(list(state))  # assuming a dict-style checkpoint
```

🔬 This is live research in progress! Check back for updates as training continues.

Training Progress: 37.8% complete towards 15B tokens