---
license: mit
---

These are the **first public interpreter models** trained on a true reasoning model, and the first trained on **any model of this scale**. Because R1 is a very large model that is difficult for most independent researchers to run, we're also uploading SQL databases containing the max activating examples for each feature.

## Model Information

This release contains two SAEs, one for general reasoning and one for math. After cloning our [demo repo](https://github.com/goodfire-ai/r1-interpretability/tree/main), you can load them with the following snippet (shown here for the math SAE):

```python
from huggingface_hub import hf_hub_download

from sae import load_math_sae

# Download the math SAE checkpoint from the Hugging Face Hub
file_path = hf_hub_download(
    repo_id="Goodfire/DeepSeek-R1-SAE-l37",
    filename="math/DeepSeek-R1-SAE-l37.pt",
    repo_type="model",
)

# Load the SAE onto the desired device
device = "cpu"
math_sae = load_math_sae(file_path, device)
```

The general reasoning SAE was trained on R1's activations over our [custom reasoning dataset](https://huggingface.co/datasets/Goodfire/r1-collect), and the math SAE on [OpenR1-Math](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k), a large dataset for mathematical reasoning. These datasets allow us to discover the features that R1 uses to answer challenging problems that exercise its reasoning chops.

Note: the original uploaded version of the logic SAE was incorrect; the correct version was uploaded on 4/17.

If you have any difficulty running the SAEs, please reach out to us!
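
Once the SAE is loaded, here is a minimal, illustrative sketch of running it on activations. It assumes the loaded object exposes `encode`/`decode` methods over layer-37 residual-stream activations (inferred from the `l37` in the checkpoint name); the actual interface is defined in the `sae` module of the demo repo, so defer to that if it differs.

```python
import torch

# Illustrative only: we assume the loaded SAE exposes encode/decode methods
# over layer-37 residual-stream activations; see the `sae` module in the demo
# repo for the actual interface. The activations below are random placeholders,
# not real R1 activations.
d_model = 7168  # assumed R1 residual stream width; confirm against the repo
resid_acts = torch.randn(1, 16, d_model)  # (batch, seq, d_model)

feature_acts = math_sae.encode(resid_acts)      # sparse feature activations
reconstruction = math_sae.decode(feature_acts)  # approximate reconstruction of the input

# Inspect the most active features at the final token position
top_vals, top_ids = feature_acts[0, -1].topk(5)
print(top_ids.tolist(), top_vals.tolist())
```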
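
If you'd rather browse features without running R1 at all, the databases of max activating examples can be explored directly. The sketch below assumes the uploaded files are SQLite databases and uses a placeholder filename; it only lists tables and columns rather than assuming any particular schema.

```python
import sqlite3

# Assumption: the uploaded databases are SQLite files. The filename below is a
# placeholder; substitute the file you actually downloaded.
conn = sqlite3.connect("max_activating_examples.db")

# List the tables, then print each table's columns, without assuming a schema
tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
for (name,) in tables:
    print(name, conn.execute(f"PRAGMA table_info({name})").fetchall())

conn.close()
```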