|
--- |
|
language: en |
|
license: mit |
|
tags: |
|
- spiking-neural-networks |
|
- language-modeling |
|
- neuromorphic |
|
- energy-efficient |
|
- biological-ai |
|
datasets: |
|
- PatrickHaller/fineweb-5B
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# 🧠 Spiking Neural Network Language Model - Training Checkpoint
|
|
|
**Live training checkpoint from the world's first large-scale spiking language model!** |
|
|
|
## Current Training Status |
|
|
|
- **Training Step**: 554,000 |
|
- **Tokens Processed**: 5.67B tokens |
|
- **Current Loss**: 4.5783 |
|
- **Spike Rate**: 0.0508 |
|
- **Learning Rate**: 8.15e-05 |
|
|
|
## Model Architecture |
|
|
|
- **Parameters**: ~54M |
|
- **Architecture**: 12-layer Spiking LTC Network |
|
- **Hidden Size**: 768 |
|
- **Sequence Length**: 1024 |
|
- **Multi-timescale Processing**: Fast → Medium → Slow layers (see the sketch after this list)
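
As a rough illustration of the multi-timescale processing described above, the sketch below organizes a 12-layer, 768-wide spiking stack into fast, medium, and slow groups. The class name `SpikingLTCLayer`, the time constants, and the reset rule are assumptions made for clarity, not the actual implementation behind this checkpoint (which also uses adaptive thresholds and refractory periods).

```python
# Minimal sketch of a fast -> medium -> slow spiking stack (illustrative only).
import torch
import torch.nn as nn

class SpikingLTCLayer(nn.Module):
    """Leaky integrate-and-fire style layer with a layer-specific time constant."""

    def __init__(self, hidden_size: int, tau: float, threshold: float = 1.0):
        super().__init__()
        self.fc = nn.Linear(hidden_size, hidden_size)
        self.tau = tau              # larger tau -> slower membrane dynamics
        self.threshold = threshold  # fixed here; adaptive in the full model

    def forward(self, x: torch.Tensor, v: torch.Tensor):
        # Leaky integration of the input current, then a hard spike at threshold.
        v = v + (self.fc(x) - v) / self.tau
        spikes = (v >= self.threshold).float()
        v = v * (1.0 - spikes)      # reset the membrane wherever a spike fired
        return spikes, v

# 12 layers split across three time constants: fast, medium, slow.
taus = [2.0] * 4 + [8.0] * 4 + [32.0] * 4
layers = nn.ModuleList([SpikingLTCLayer(hidden_size=768, tau=tau) for tau in taus])
```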
|
|
|
## Training Details |
|
|
|
- **Dataset**: PatrickHaller/fineweb-5B |
|
- **Target**: 3 epochs (~15B tokens total) |
|
- **Biological Dynamics**: Adaptive thresholds and refractory periods (see the sketch after this list)

- **Energy Efficiency**: ~5% neuron activation vs. dense (100%) activation in Transformers
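
The bullet on biological dynamics is easier to follow with a concrete update rule. The function below is a minimal sketch assuming a simple leaky neuron; every parameter name and value (`tau`, `thr_jump`, `refrac_steps`, ...) is an illustrative assumption rather than the checkpoint's actual code.

```python
# One illustrative time step combining a leaky membrane, an adaptive threshold,
# and a refractory period (assumed formulation, not the model's exact rule).
import torch

def spiking_step(v, thr, refrac, inp, tau=4.0, thr_decay=0.95,
                 thr_jump=0.2, base_thr=1.0, refrac_steps=2):
    active = (refrac <= 0).float()        # neurons still in their refractory period stay silent
    v = v + active * (inp - v) / tau      # leaky integration for active neurons
    spikes = active * (v >= thr).float()  # spike where the adaptive threshold is crossed
    v = v * (1.0 - spikes)                # reset the membrane after a spike
    thr = base_thr + thr_decay * (thr - base_thr) + thr_jump * spikes  # threshold rises after spiking
    refrac = torch.clamp(refrac - 1, min=0) + spikes * refrac_steps    # restart the refractory clock
    return v, thr, refrac, spikes

# The reported spike rate (~0.05) is presumably an average of `spikes` over
# neurons and time steps, i.e. roughly 5% of neurons firing at any moment.
```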
|
|
|
## Scientific Significance |
|
|
|
This represents ongoing training of the first large-scale spiking neural network for language modeling, demonstrating: |
|
|
|
1. **Biological neural dynamics** can learn language at scale |
|
2. **Energy efficiency** through sparse neural firing (see the rough estimate after this list)
|
3. **Multi-timescale processing** for hierarchical understanding |
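
To make the energy-efficiency point concrete, the back-of-the-envelope count below compares a dense layer against an event-driven one at the reported ~5% spike rate. The numbers are illustrative assumptions, not measured energy figures.

```python
# Rough synaptic-operation count for one 768-wide layer (illustrative only).
hidden = 768
spike_rate = 0.05                              # roughly the spike rate reported above

dense_ops = hidden * hidden                    # dense layer: every unit drives every output
event_ops = int(spike_rate * hidden) * hidden  # event-driven: only spiking units propagate

print(f"dense: {dense_ops:,} ops vs event-driven: {event_ops:,} ops "
      f"(~{event_ops / dense_ops:.0%} of dense)")
```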
|
|
|
## Usage |
|
|
|
```python
# Download this checkpoint from the Hub
from huggingface_hub import hf_hub_download
import torch

checkpoint_path = hf_hub_download(
    repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
    filename="checkpoint_554000.pt",
)

# Loading the weights into a model requires the custom spiking model code
# (see the full implementation for complete usage); assuming a standard
# torch.save checkpoint, the raw contents can be inspected with:
state = torch.load(checkpoint_path, map_location="cpu")
```
|
|
|
--- |
|
|
|
**🔬 This is live research in progress! Check back for updates as training continues.**
|
|
|
**Training Progress**: 37.8% complete toward the 15B-token target
|
|