LLaMA-3.1-8B Cognitive Actions SAE

This is a Sparse Autoencoder (SAE) trained on layer 11 activations from LLaMA-3.1-8B-Instruct using the FAST methodology.

Model Details

  • Base Model: meta-llama/Llama-3.1-8B-Instruct
  • Layer: 11
  • Dataset: Cognitive Actions (7K examples)
  • SAE Architecture: M = 256 (dictionary size), K = 8 (active latents per token; see the sketch after this list)
  • Methodology: FAST (Finetuning-aligned Sequential Training)
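A K of 8 together with ~8 active features per token (see Performance) points to a TopK-style sparse autoencoder. The following is a minimal sketch of such an architecture with M = 256 and K = 8; the class name and the exact FAST/HypotheSAEs formulation (bias handling, normalization) are assumptions rather than details from this card.

import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    # Sketch of a TopK sparse autoencoder (assumed architecture).
    def __init__(self, d_model: int, m: int = 256, k: int = 8):
        super().__init__()
        self.k = k
        self.pre_bias = nn.Parameter(torch.zeros(d_model))   # subtracted before encoding, added back after decoding
        self.encoder = nn.Linear(d_model, m)                  # d_model -> M latents
        self.decoder = nn.Linear(m, d_model, bias=False)      # M latents -> d_model

    def encode(self, x):
        z = torch.relu(self.encoder(x - self.pre_bias))
        # Keep only the K largest activations per token; zero out the rest.
        topk = torch.topk(z, self.k, dim=-1)
        sparse = torch.zeros_like(z)
        sparse.scatter_(-1, topk.indices, topk.values)
        return sparse

    def forward(self, x):
        codes = self.encode(x)
        x_hat = self.decoder(codes) + self.pre_bias
        return x_hat, codes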

Performance

  • MSE: 0.0065
  • Normalized MSE: 0.0140 (metric definitions sketched after this list)
  • Active features/token: 7.99
  • Dead neurons: 0.00%
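The exact metric definitions used for this card are not stated; the sketch below follows common SAE evaluation conventions (normalized MSE as reconstruction MSE divided by the MSE of a mean-activation baseline, dead neurons as latents that never fire on the evaluation tokens), which should be treated as assumptions.

import torch

def normalized_mse(x, x_hat):
    # Reconstruction MSE divided by the MSE of predicting the per-dimension mean activation.
    mse = ((x - x_hat) ** 2).mean()
    baseline = ((x - x.mean(dim=0, keepdim=True)) ** 2).mean()
    return (mse / baseline).item()

def active_features_per_token(codes):
    # Average L0 (number of nonzero latents) per token; close to 8 here because K = 8.
    return (codes > 0).float().sum(dim=-1).mean().item()

def dead_neuron_fraction(codes):
    # Fraction of latents that never activate across the evaluation tokens.
    fired = (codes > 0).any(dim=0)
    return 1.0 - fired.float().mean().item()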

Usage

from hypothesaes.sae import load_model

# Load the SAE, then encode pre-computed layer-11 activations of the base model
# into sparse feature activations.
sae = load_model("Koalacrown/llama3.1-8b-it-cognitive-actions-sae-l11")
features = sae.get_activations(activations)
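The `activations` argument is expected to be layer-11 activations of the base model. One way to collect them with the transformers library is sketched below; the exact hook point (here the hidden state output by layer 11) and any pooling or flattening expected by `get_activations` are assumptions rather than details from this card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

text = "I paused to weigh the evidence before changing my mind."
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding output, so hidden_states[11] is the output of layer 11.
activations = outputs.hidden_states[11].float()   # (batch, seq_len, d_model)
features = sae.get_activations(activations)       # sparse feature activations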

Training

Trained using HypotheSAEs with the following configuration:

  • Epochs: 100
  • Batch size: 512
  • Learning rate: 0.0005
  • Matryoshka prefixes: [64, 256] (prefix loss sketched after this list)
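The Matryoshka prefixes indicate that reconstruction is also supervised using only the first 64 of the 256 latents, so the leading latents form a usable smaller dictionary. A minimal sketch of such a prefix loss, assuming a plain average over prefixes (the weighting actually used in training is not specified here):

import torch

def matryoshka_loss(x, codes, decoder_weight, decoder_bias, prefixes=(64, 256)):
    # x: (batch, d_model) targets; codes: (batch, M) sparse latents; decoder_weight: (d_model, M).
    loss = 0.0
    for m in prefixes:
        # Reconstruct using only the first m latents (a nested "prefix" of the dictionary).
        x_hat = codes[:, :m] @ decoder_weight[:, :m].T + decoder_bias
        loss = loss + ((x - x_hat) ** 2).mean()
    return loss / len(prefixes)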

Citation

If you use this SAE, please cite the FAST methodology and HypotheSAEs.
