rootxhacker committed
Commit 3124298 · verified · 1 Parent(s): 301babd

Training checkpoint 2000 - 0.02B tokens

Files changed (3)
  1. README.md +68 -0
  2. checkpoint_2000.pt +3 -0
  3. config.json +14 -0
README.md ADDED
@@ -0,0 +1,68 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - spiking-neural-networks
+ - language-modeling
+ - neuromorphic
+ - energy-efficient
+ - biological-ai
+ datasets:
+ - PatrickHaller/fineweb-5B
+ pipeline_tag: text-generation
+ ---
+
+ # 🧠 Spiking Neural Network Language Model - Training Checkpoint
+
+ **Live training checkpoint from the world's first large-scale spiking language model!**
+
+ ## Current Training Status
+
+ - **Training Step**: 2,000
+ - **Tokens Processed**: 0.02B tokens (20.48M; see the sanity check below)
+ - **Current Loss**: 9.8659
+ - **Spike Rate**: 0.0115
+ - **Learning Rate**: 1.02e-05
+
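+ The token count can be cross-checked from the step count alone. A quick sanity check (the batch size of 10 sequences per step is inferred from `tokens_processed` in config.json, not stated explicitly):
+
+ ```python
+ # tokens processed = steps x sequences per step x tokens per sequence
+ steps = 2000
+ batch_size = 10      # assumption, inferred from config.json's tokens_processed
+ seq_len = 1024       # sequence length from the model config
+
+ tokens = steps * batch_size * seq_len
+ print(f"{tokens:,} tokens (~{tokens / 1e9:.2f}B)")  # 20,480,000 tokens (~0.02B)
+ ```
+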
+ ## Model Architecture
+
+ - **Parameters**: ~54M
+ - **Architecture**: 12-layer Spiking LTC Network
+ - **Hidden Size**: 768
+ - **Sequence Length**: 1024
+ - **Multi-timescale Processing**: Fast → Medium → Slow layers (see the sketch below)
+
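+ The modeling code is not part of this checkpoint repo, so the exact layer definition is not shown here. As a rough illustration of the idea (a leaky, "liquid time-constant"-style integrator where each layer's time constant sets its timescale), here is a toy sketch; the class name, time constants, and reset rule are illustrative assumptions, not the actual implementation:
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+ class SpikingLTCLayer(nn.Module):
+     """Toy leaky integrator with a learnable time constant and a spike threshold."""
+     def __init__(self, hidden_size: int, tau: float):
+         super().__init__()
+         self.fc = nn.Linear(hidden_size, hidden_size)
+         # Larger tau -> slower decay -> the layer integrates over longer timescales.
+         self.log_tau = nn.Parameter(torch.log(torch.tensor(tau)))
+         self.threshold = 1.0
+
+     def forward(self, x, v):
+         # x: (batch, hidden) input at one timestep; v: membrane potential state.
+         decay = torch.exp(-1.0 / torch.exp(self.log_tau))
+         v = decay * v + self.fc(x)
+         spikes = (v >= self.threshold).float()   # sparse binary activations
+         v = v - spikes * self.threshold          # soft reset where a spike fired
+         return spikes, v
+
+ # Fast -> medium -> slow stack, mirroring the multi-timescale description above.
+ layers = [SpikingLTCLayer(768, tau) for tau in (2.0, 8.0, 32.0)]
+ x, states = torch.randn(4, 768), [torch.zeros(4, 768) for _ in layers]
+ for i, layer in enumerate(layers):
+     x, states[i] = layer(x, states[i])
+ print(x.mean().item())  # fraction of neurons that spiked at this layer
+ ```
+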
+ ## Training Details
+
+ - **Dataset**: PatrickHaller/fineweb-5B
+ - **Target**: 3 epochs (~15B tokens total)
+ - **Biological Dynamics**: Adaptive thresholds, refractory periods (sketched below)
+ - **Energy Efficiency**: ~5% neuron activation per step, versus dense (100%) activation in standard Transformers
+
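+ The README names these mechanisms but not their update rules. One common formulation (an assumption about this model, not its actual code) raises a neuron's threshold after it fires and silences it for a few steps:
+
+ ```python
+ import torch
+
+ def spike_step(v, thr, refrac, base_thr=1.0, thr_decay=0.95,
+                thr_jump=0.5, refrac_steps=3.0):
+     """One illustrative update of adaptive thresholds and refractory masking."""
+     can_fire = refrac <= 0                # refractory neurons are silenced
+     spikes = ((v >= thr) & can_fire).float()
+     # Adaptive threshold: relaxes toward its base value, jumps up on a spike.
+     thr = base_thr + thr_decay * (thr - base_thr) + thr_jump * spikes
+     # Refractory period: a neuron that just fired stays silent for a few steps.
+     refrac = torch.where(spikes.bool(),
+                          torch.full_like(refrac, refrac_steps),
+                          (refrac - 1).clamp(min=0.0))
+     return spikes, thr, refrac
+
+ spikes, thr, refrac = spike_step(torch.randn(768) * 2,
+                                  torch.ones(768), torch.zeros(768))
+ print(spikes.mean().item())  # the "spike rate" tracked in the status above
+ ```
+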
+ ## Scientific Significance
+
+ This checkpoint comes from ongoing training of the first large-scale spiking neural network for language modeling, demonstrating that:
+
+ 1. **Biological neural dynamics** can learn language at scale
+ 2. **Energy efficiency** follows from sparse neural firing
+ 3. **Multi-timescale processing** supports hierarchical understanding
+
+ ## Usage
+
+ ```python
+ # Download this checkpoint from the Hub
+ from huggingface_hub import hf_hub_download
+
+ checkpoint = hf_hub_download(
+     repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
+     filename="checkpoint_2000.pt",
+ )
+
+ # Load with the custom spiking model code
+ # (see the full implementation for complete usage)
+ ```
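+
+ The spiking model class itself ships separately, but both files in this commit can be inspected with stock libraries. A minimal sketch, continuing from the download snippet above and assuming the checkpoint is an ordinary `torch.save` dictionary (the exact keys depend on the training script):
+
+ ```python
+ import json
+ import torch
+
+ # The training metadata is plain JSON and needs no model code to read.
+ cfg_path = hf_hub_download(
+     repo_id="rootxhacker/piking-llm-5b-3epochs-exp",
+     filename="config.json",
+ )
+ with open(cfg_path) as f:
+     cfg = json.load(f)
+ print(cfg["training_step"], cfg["loss"], cfg["spike_rate"])
+
+ # ~1 GB file; map to CPU so inspection doesn't require a GPU.
+ state = torch.load(checkpoint, map_location="cpu")
+ if isinstance(state, dict):
+     print(list(state.keys()))
+ ```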
+
+ ---
+
+ **🔬 This is live research in progress! Check back for updates as training continues.**
+
+ **Training Progress**: ~0.14% complete towards 15B tokens
checkpoint_2000.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a2525c357147a84397048617fc112cfaebe7389e44328d60cac9eb224f13894d
+ size 999026318
config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "model_type": "spiking_llm",
+   "vocab_size": 50257,
+   "hidden_size": 768,
+   "num_layers": 12,
+   "max_seq_length": 1024,
+   "training_step": 2000,
+   "tokens_processed": 20480000,
+   "loss": 9.865856721571749,
+   "spike_rate": 0.011451726761236444,
+   "learning_rate": 1.019965406348427e-05,
+   "epoch": 0.004096,
+   "progress_percent": 0.13653368285956147
+ }
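
The `epoch` and `progress_percent` fields above follow from the token counts. A quick check, assuming the dataset is taken as exactly 5B tokens (the README says "~5B", so the stored percentage differs slightly):

```python
tokens_processed = 20_480_000
dataset_tokens = 5_000_000_000   # assumption: "~5B" treated as exact
target_epochs = 3

epoch = tokens_processed / dataset_tokens
print(epoch)                        # 0.004096, matching the config exactly
print(100 * epoch / target_epochs)  # ~0.13653, close to the stored progress_percent
```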