LLaMA-3.1-8B Cognitive Actions SAE
This is a Sparse Autoencoder (SAE) trained on layer 11 activations from LLaMA-3.1-8B-Instruct using the FAST methodology.
Model Details
- Base Model: meta-llama/Llama-3.1-8B-Instruct
- Layer: 11
- Dataset: Cognitive Actions (7K examples)
- SAE Architecture: M = 256 (dictionary size), K = 8 (active features per token, TopK); see the sketch after this list
- Methodology: FAST (Finetuning-aligned Sequential Training)
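Here M is the total number of latent features in the dictionary and K is the number of features kept active for each token via a TopK activation. The class below is a generic TopK SAE sketch at this configuration; it is not the HypotheSAEs implementation, and the hidden size of 4096 (LLaMA-3.1-8B's model dimension) is an assumption of the example.

```python
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    """Minimal TopK sparse autoencoder matching the M=256, K=8 configuration."""

    def __init__(self, d_model: int = 4096, m: int = 256, k: int = 8):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, m)
        self.decoder = nn.Linear(m, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Compute pre-activations for all M features, then keep only the
        # K largest per token; everything else is zeroed out.
        pre = self.encoder(x)
        topk = torch.topk(pre, self.k, dim=-1)
        acts = torch.zeros_like(pre)
        acts.scatter_(-1, topk.indices, torch.relu(topk.values))
        return acts

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct the input from the sparse feature activations.
        return self.decoder(self.encode(x))
```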
Performance
- MSE: 0.0065
- Normalized MSE: 0.0140
- Active features/token: 7.99
- Dead neurons: 0.00% (see the metric definitions sketched after this list)
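These figures follow common SAE evaluation conventions; the exact definitions used by HypotheSAEs may differ, so the sketch below (normalized MSE as MSE divided by input variance, dead neurons as features that never fire on the evaluation set) is an assumption of the example.

```python
import torch

def sae_metrics(x: torch.Tensor, x_hat: torch.Tensor, acts: torch.Tensor):
    """Reconstruction and sparsity statistics for an SAE evaluation set.

    x:     original layer-11 activations, shape (n_tokens, d_model)
    x_hat: SAE reconstructions, same shape as x
    acts:  SAE feature activations, shape (n_tokens, M)
    """
    mse = ((x - x_hat) ** 2).mean()
    # Normalized MSE: reconstruction error relative to the variance of the inputs.
    norm_mse = mse / x.var()
    # Average number of nonzero features per token (should be close to K).
    active_per_token = (acts > 0).float().sum(dim=-1).mean()
    # A feature is "dead" if it never activates on the evaluation set.
    dead_frac = ((acts > 0).sum(dim=0) == 0).float().mean()
    return mse.item(), norm_mse.item(), active_per_token.item(), dead_frac.item()
```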
Usage
```python
from hypothesaes.sae import load_model

# Load the trained SAE; `activations` should be layer-11 hidden states from the
# base model (see the extraction sketch below).
sae = load_model("Koalacrown/llama3.1-8b-it-cognitive-actions-sae-l11")
features = sae.get_activations(activations)
```
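The snippet above assumes you already have layer-11 activations. One way to obtain them with Hugging Face transformers is sketched below; the layer indexing convention and the activation format expected by `get_activations` are assumptions of the example and may differ from the setup used during FAST training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

text = "I paused to reconsider my assumptions before answering."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding output, so index 11 is the output of
# decoder block 11 (assumed to correspond to the "layer 11" used for this SAE).
activations = out.hidden_states[11].squeeze(0)  # (seq_len, 4096)
```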
Training
Trained using HypotheSAEs with the following configuration:
- Epochs: 100
- Batch size: 512
- Learning rate: 0.0005
- Matryoshka prefixes: [64, 256] (nested dictionary sizes trained jointly; see the sketch after this list)
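Matryoshka training computes the reconstruction loss not only over the full 256-feature dictionary but also over the nested prefix of its first 64 features, so the leading features remain useful on their own. The helper below is a hedged illustration of that idea, not the exact FAST/HypotheSAEs training loss (which may include additional terms); the tensor shapes are assumptions of the example.

```python
import torch

def matryoshka_recon_loss(x, acts, w_dec, b_dec, prefixes=(64, 256)):
    """Average reconstruction loss over nested dictionary prefixes.

    x:     inputs, shape (n_tokens, d_model)
    acts:  sparse feature activations, shape (n_tokens, M)
    w_dec: decoder weights, shape (M, d_model)
    b_dec: decoder bias, shape (d_model,)
    """
    losses = []
    for p in prefixes:
        # Reconstruct using only the first `p` features of the dictionary.
        x_hat = acts[:, :p] @ w_dec[:p, :] + b_dec
        losses.append(((x - x_hat) ** 2).mean())
    return torch.stack(losses).mean()
```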
Citation
If you use this SAE, please cite the FAST methodology and HypotheSAEs.