---
language:
- en
tags:
- protein-language-models
- sparse-autoencoder
license: mit
---

# Sparse Autoencoders for ESM-2 (8M)

Interpret protein language model representations using sparse autoencoders (SAEs) trained on ESM-2 (8M) layers. These models decompose complex neural representations into interpretable features, enabling a deeper understanding of how protein language models process sequence information.

* 📊 Model details in the [InterPLM pre-print](https://www.biorxiv.org/content/10.1101/2024.11.14.623630v1)
* 👩‍💻 Training and analysis code in the [GitHub repo](https://github.com/ElanaPearl/InterPLM)
* 🧬 Explore features at [interPLM.ai](https://interplm.ai)

## Model Details
- Base Model: ESM-2 8M (6 layers)
- Architecture: Sparse Autoencoder
- Input Dimension: 320
- Feature Dimension: 10,240
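
For orientation, the dimensions above correspond to an encoder/decoder pair along the lines of the sketch below. This is an illustrative stand-in, not the InterPLM implementation: the class and method names here are made up, and details such as biases, tied weights, or pre-encoder normalization may differ from the released checkpoints.

```python
import torch
import torch.nn as nn


class SparseAutoencoderSketch(nn.Module):
    """Toy SAE matching the shapes above: 320-d ESM-2 (8M) activations -> 10,240 features."""

    def __init__(self, d_model: int = 320, d_features: int = 10_240):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU keeps feature activations non-negative and, after training with a
        # sparsity penalty, mostly zero -- i.e. sparse and easier to interpret.
        return torch.relu(self.encoder(x))

    def decode(self, features: torch.Tensor) -> torch.Tensor:
        # Map sparse feature activations back into the 320-d embedding space.
        return self.decoder(features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decode(self.encode(x))
```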

## Available Models

We provide SAE models trained on different layers of ESM-2 8M:

| Model name | ESM-2 model | ESM-2 layer |
|------------|-------------|-------------|
| [InterPLM-esm2-8m-l1](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_1) | esm2_t6_8M_UR50D | 1 |
| [InterPLM-esm2-8m-l2](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_2) | esm2_t6_8M_UR50D | 2 |
| [InterPLM-esm2-8m-l3](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_3) | esm2_t6_8M_UR50D | 3 |
| [InterPLM-esm2-8m-l4](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_4) | esm2_t6_8M_UR50D | 4 |
| [InterPLM-esm2-8m-l5](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_5) | esm2_t6_8M_UR50D | 5 |
| [InterPLM-esm2-8m-l6](https://huggingface.co/Elana/InterPLM-esm2-8m/tree/main/layer_6) | esm2_t6_8M_UR50D | 6 |

All models share the same architecture and dictionary size (10,240). See [InterPLM-esm2-650m](https://huggingface.co/Elana/InterPLM-esm2-650m) for SAEs trained on ESM-2 650M. The 650M SAEs capture more known biological concepts than the 8M SAEs, but require additional compute for both ESM embedding and SAE feature extraction.
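
If you want to check the repository layout programmatically, `huggingface_hub` can list the per-layer checkpoint files. The expected paths below follow the table above; the exact file set is whatever the repo currently contains.

```python
from huggingface_hub import list_repo_files

# List all SAE checkpoints in the repo, grouped by layer subdirectory
files = list_repo_files("Elana/InterPLM-esm2-8m")
checkpoints = sorted(f for f in files if f.endswith(".pt"))
print(checkpoints)  # expected: layer_1/ae_normalized.pt, layer_1/ae_unnormalized.pt, ...
```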

## Usage

Extract interpretable features from protein sequences:

```python
from huggingface_hub import hf_hub_download
from interplm.sae.inference import load_model
from interplm.esm.embed import embed_single_sequence

# Select the ESM-2 layer to analyze (must be one of 1-6)
layer_num = 4

# Download the SAE weights for that layer and load the model
weights_path = hf_hub_download(
    repo_id="Elana/InterPLM-esm2-8m",
    filename=f"layer_{layer_num}/ae_normalized.pt",
)
sae = load_model(weights_path)

# Get per-residue ESM-2 embeddings for a protein sequence
protein_embeddings = embed_single_sequence(
    sequence="MRWQEMGYIFYPRKLR",
    model_name="esm2_t6_8M_UR50D",
    layer=layer_num,
)

# Encode the embeddings into sparse, interpretable SAE features
features = sae.encode(protein_embeddings)
```
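
To get a quick sense of what the SAE found, you can look at the strongest features at each residue. The snippet below assumes `features` from the example above is a 2-D tensor of shape `[sequence_length, 10240]` with non-negative activations; adjust the indexing if the embedding helper returns a different shape.

```python
import torch

# Top 5 most active SAE features at each residue position
top_vals, top_idx = torch.topk(features, k=5, dim=-1)
for pos, (vals, idx) in enumerate(zip(top_vals, top_idx)):
    active = [(int(i), round(float(v), 3)) for i, v in zip(idx, vals) if v > 0]
    print(f"residue {pos}: {active}")
```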

For detailed training and analysis examples, see the [GitHub README](https://github.com/ElanaPearl/InterPLM/blob/main/README.md).

## Model Variants

Each layer model is available in two variants:

- Normalized (`ae_normalized.pt`): Features are L2-normalized before encoding, making the magnitude of activations consistent across different inputs. This can improve interpretability by focusing on relative feature patterns rather than absolute magnitudes. Recommended for most analyses focused on feature interpretation.

- Unnormalized (`ae_unnormalized.pt`): Raw activation features without normalization. These preserve the original magnitude information from the ESM model, which can be important for tasks where activation strength carries meaningful signal. Use these if you need to analyze absolute activation magnitudes or when combining features with other ESM-based tools (see the example below).
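
For example, switching variants only changes the filename passed to `hf_hub_download`; loading and encoding then follow the Usage example above (layer 4 shown for concreteness):

```python
from huggingface_hub import hf_hub_download

# Download the unnormalized checkpoint for layer 4 instead of the normalized one
unnormalized_path = hf_hub_download(
    repo_id="Elana/InterPLM-esm2-8m",
    filename="layer_4/ae_unnormalized.pt",
)
```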