---
language: en
license: mit
tags:
  - spiking-neural-networks
  - language-modeling
  - neuromorphic
  - energy-efficient
  - biological-ai
datasets:
  - PatrickHaller/fineweb-5B
pipeline_tag: text-generation
---

🧠 Spiking Neural Network Language Model - Training Checkpoint

Live training checkpoint from the world's first large-scale spiking language model!

Current Training Status

  • Training Step: 554,000
  • Tokens Processed: 5.67B tokens
  • Current Loss: 4.5783
  • Spike Rate: 0.0508 (average fraction of neurons firing per step; see the measurement sketch after this list)
  • Learning Rate: 8.15e-05
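
The spike rate above is the average fraction of neurons that fire per timestep. Below is a minimal sketch of how such a number can be measured from a binary spike tensor; the tensor shape and helper name are illustrative, not taken from the training code.

```python
import torch

def spike_rate(spikes: torch.Tensor) -> float:
    """Average fraction of active neurons.

    `spikes` is assumed to be a binary tensor of shape
    (batch, seq_len, hidden) with 1.0 wherever a neuron fired.
    """
    return spikes.float().mean().item()

# Toy tensor with ~5% of neurons firing, matching the reported rate
spikes = (torch.rand(8, 1024, 768) < 0.05).float()
print(f"spike rate: {spike_rate(spikes):.4f}")  # ~0.05
```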

Model Architecture

  • Parameters: ~54M
  • Architecture: 12-layer Spiking LTC Network
  • Hidden Size: 768
  • Sequence Length: 1024
  • Multi-timescale Processing: Fast → Medium → Slow layers (see the configuration sketch after this list)
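
To make the figures above concrete, here is a minimal configuration sketch. The field names and the even split of the 12 layers into fast/medium/slow timescale groups are assumptions for illustration; the actual model code may organize these differently.

```python
from dataclasses import dataclass

@dataclass
class SpikingLTCConfig:
    # Figures stated in this model card
    n_layers: int = 12      # 12-layer Spiking LTC network
    hidden_size: int = 768
    seq_len: int = 1024

    # Assumed even split into timescale groups (fast -> medium -> slow)
    timescale_groups: tuple = ("fast",) * 4 + ("medium",) * 4 + ("slow",) * 4

config = SpikingLTCConfig()
print(config.timescale_groups)
```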

Training Details

  • Dataset: PatrickHaller/fineweb-5B
  • Target: 3 epochs (~15B tokens total)
  • Biological Dynamics: adaptive thresholds and refractory periods (a minimal sketch follows this list)
  • Energy Efficiency: roughly 5% of neurons active per step, versus dense activation in standard Transformers
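
The sketch below illustrates the dynamics named above: a leaky integrate-and-fire style update with an adaptive threshold and a refractory period, which together produce the sparse firing that the energy-efficiency claim rests on. The time constants, threshold parameters, and variable names are illustrative assumptions, not the model's actual values.

```python
import torch

def lif_step(v, thresh, refrac, x, tau=0.9, baseline=1.0,
             thresh_decay=0.95, thresh_bump=0.5, refrac_steps=2):
    """One spiking-neuron update with an adaptive threshold and a refractory period.

    v       membrane potentials   (batch, hidden)
    thresh  per-neuron thresholds (batch, hidden)
    refrac  refractory counters   (batch, hidden); steps left before a neuron may fire again
    x       input currents        (batch, hidden)
    """
    # Leaky integration of the input current
    v = tau * v + x

    # A neuron fires only if it crosses its threshold and is not refractory
    can_fire = (refrac <= 0).float()
    spikes = (v >= thresh).float() * can_fire

    # Reset fired neurons and start their refractory period
    v = v * (1.0 - spikes)
    refrac = torch.where(spikes.bool(),
                         torch.full_like(refrac, float(refrac_steps)),
                         refrac - 1)

    # Adaptive threshold: jumps after a spike, then decays back toward the baseline
    thresh = baseline + thresh_decay * (thresh - baseline) + thresh_bump * spikes

    return v, thresh, refrac, spikes

# Toy rollout: sparse firing emerges from thresholding plus refractoriness
v = torch.zeros(4, 768)
thresh = torch.ones(4, 768)
refrac = torch.zeros(4, 768)
for _ in range(20):
    v, thresh, refrac, spikes = lif_step(v, thresh, refrac, x=0.3 * torch.randn(4, 768))
print(f"final-step spike rate: {spikes.mean().item():.3f}")
```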

Scientific Significance

This represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating:

  1. Biological neural dynamics can learn language at scale
  2. Sparse neural firing delivers energy efficiency
  3. Multi-timescale processing supports hierarchical understanding

Usage

```python
# Download this checkpoint from the Hub
from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt",
)

# Running the model requires the custom spiking model code
# (see the full implementation for complete usage).
# As a quick sanity check, list the top-level contents of the file.
state = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
print(list(state))  # assuming a dict-style checkpoint
```

🔬 This is live research in progress! Check back for updates as training continues.

Training Progress: 37.8% complete towards 15B tokens