---
language:
- en
tags:
- protein-language-models
- sparse-autoencoder
license: mit
---
# Sparse Autoencoders for ESM-2 (8M)
Interpret protein language model representations using sparse autoencoders trained on ESM-2 (8M) layers. These models decompose complex neural representations into interpretable features, enabling deeper understanding of how protein language models process sequence information.
* 📊 Model details in the [InterPLM pre-print](https://www.biorxiv.org/content/10.1101/2024.11.14.623630v1)
* 👩‍💻 Training and analysis code in the [GitHub repo](https://github.com/ElanaPearl/InterPLM)
* 🧬 Explore features at [interPLM.ai](https://interplm.ai)
## Model Details
- Base Model: ESM-2 8M (6 layers)
- Architecture: Sparse Autoencoder
- Input Dimension: 320
- Feature Dimension: 10,240
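Concretely, each SAE maps a 320-dimensional ESM-2 hidden state to 10,240 sparse feature activations and reconstructs the input from them. The snippet below is a minimal sketch of that layout for orientation only, assuming a standard ReLU sparse autoencoder; the class and method names are placeholders, not the InterPLM implementation.

```python
import torch
import torch.nn as nn

class SparseAutoencoderSketch(nn.Module):
    """Minimal sketch of the SAE layout (placeholder, not the InterPLM code)."""

    def __init__(self, d_input: int = 320, d_features: int = 10_240):
        super().__init__()
        self.encoder = nn.Linear(d_input, d_features)   # 320 -> 10,240
        self.decoder = nn.Linear(d_features, d_input)   # 10,240 -> 320

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU keeps only non-negative activations, so most features are zero (sparse)
        return torch.relu(self.encoder(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct the original embedding from the sparse features
        return self.decoder(self.encode(x))
```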
## Available Models
We provide SAE models trained on different layers of ESM-2-8M:
| Model name | ESM2 model | ESM2 layer |
|-|-|-|
| [InterPLM-esm2-8m-l1](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_1) | esm2_t6_8M_UR50D | 1 |
| [InterPLM-esm2-8m-l2](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_2) | esm2_t6_8M_UR50D | 2 |
| [InterPLM-esm2-8m-l3](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_3) | esm2_t6_8M_UR50D | 3 |
| [InterPLM-esm2-8m-l4](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_4) | esm2_t6_8M_UR50D | 4 |
| [InterPLM-esm2-8m-l5](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_5) | esm2_t6_8M_UR50D | 5 |
| [InterPLM-esm2-8m-l6](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_6) | esm2_t6_8M_UR50D | 6 |
All models share the same architecture and dictionary size (10,240). See [here](https://huggingface.co/Elana/InterPLM-esm2-650m) for SAEs trained on ESM-2 650M. The 650M SAEs capture more known biological concepts than the 8M but require additional compute for both ESM embedding and SAE feature extraction.
## Usage
Extract interpretable features from protein sequences:
```python
from huggingface_hub import hf_hub_download
from interplm.sae.inference import load_model
from interplm.esm.embed import embed_single_sequence

# Select ESM layer (must be one of 1-6)
layer_num = 4

# Download the SAE weights for the chosen layer and load the model
weights_path = hf_hub_download(
    repo_id="Elana/InterPLM-esm2-8m",
    filename=f"layer_{layer_num}/ae_normalized.pt"
)
sae = load_model(weights_path)

# Get ESM embeddings for a protein sequence
protein_embeddings = embed_single_sequence(
    sequence="MRWQEMGYIFYPRKLR",
    model_name="esm2_t6_8M_UR50D",
    layer=layer_num,
)

# Extract SAE features from the embeddings
features = sae.encode(protein_embeddings)
```
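The returned `features` has one row per residue and 10,240 columns, with most activations equal to zero. As a quick illustration (assuming `features` is a PyTorch tensor produced by the example above), you can list the top activating features at a given residue:

```python
import torch

# Inspect which SAE features fire at a single residue
position = 5  # 0-based residue index
top_values, top_feature_ids = torch.topk(features[position], k=10)
for feat_id, value in zip(top_feature_ids.tolist(), top_values.tolist()):
    if value > 0:  # only report features that actually activated
        print(f"feature {feat_id}: activation {value:.3f}")
```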
For detailed training and analysis examples, see the [GitHub README](https://github.com/ElanaPearl/InterPLM/blob/main/README.md).
## Model Variants
Each layer model is available in two variants:
- Normalized (`ae_normalized.pt`): Features are L2-normalized before encoding, making the magnitude of activations consistent across different inputs. This can improve interpretability by focusing on relative feature patterns rather than absolute magnitudes. Recommended for most analyses focused on feature interpretation.
- Unnormalized (`ae_unnormalized.pt`): Raw activation features without normalization. These preserve the original magnitude information from the ESM model, which can be important for tasks where activation strength carries meaningful signal. Use these if you need to analyze absolute activation magnitudes or when combining features with other ESM-based tools.
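Switching between variants only changes the filename passed to `hf_hub_download`. For example, a sketch loading the unnormalized weights for the same layer (assuming `load_model` accepts the downloaded weights path, as in the usage example above):

```python
from huggingface_hub import hf_hub_download
from interplm.sae.inference import load_model

layer_num = 4

# Same repo and layer as the usage example, but pointing at the unnormalized weights
unnormalized_path = hf_hub_download(
    repo_id="Elana/InterPLM-esm2-8m",
    filename=f"layer_{layer_num}/ae_unnormalized.pt"
)
sae_unnormalized = load_model(unnormalized_path)
```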