# Affine Transform: EleutherAI/deep-ignorance-pretraining-stage-unfiltered (global_step38144) → EleutherAI/deep-ignorance-unfiltered
Per-layer learned affine transformations that map hidden-state activations from a source checkpoint into the activation space of a target model.
## Usage
```python
import json

import torch.nn as nn
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Download the transform weights and metadata
weights_path = hf_hub_download(
    repo_id="EleutherAI/affine-checkpoint-transfer",
    filename="affine_transforms.safetensors",
)
metadata_path = hf_hub_download(
    repo_id="EleutherAI/affine-checkpoint-transfer",
    filename="metadata.json",
)

# Load the metadata and the per-layer weights
with open(metadata_path) as f:
    metadata = json.load(f)
weights = load_file(weights_path)

# Reconstruct one nn.Linear (weight + bias) per transformer layer
affine_transforms = {}
for layer_idx in metadata["layer_indices"]:
    linear = nn.Linear(metadata["hidden_dim"], metadata["hidden_dim"], bias=True)
    linear.weight.data = weights[f"layer_{layer_idx}.weight"]
    linear.bias.data = weights[f"layer_{layer_idx}.bias"]
    affine_transforms[layer_idx] = linear
```
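Once loaded, each transform is an ordinary `nn.Linear` that can be applied to a hidden-state tensor. The sketch below is illustrative only: it uses a randomly initialized `nn.Linear` and a random tensor as stand-ins for a loaded entry of `affine_transforms` and a real activation of shape `(batch, seq_len, hidden_dim)`, just to show the call shape.

```python
import torch
import torch.nn as nn

# Stand-in for one entry of affine_transforms; hidden_dim = 4096 as in the metadata
hidden_dim = 4096
transform = nn.Linear(hidden_dim, hidden_dim, bias=True)

# Placeholder for a real hidden-state activation: (batch, seq_len, hidden_dim)
hidden = torch.randn(2, 8, hidden_dim)

with torch.no_grad():
    mapped = transform(hidden)  # source activations mapped toward the target space

assert mapped.shape == hidden.shape
```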
## MSE Metrics
| Layer | MSE |
|---|---|
| 0 | 0.000050 |
| 1 | 0.014719 |
| 2 | 0.020466 |
| 3 | 0.034601 |
| 4 | 0.056137 |
| 5 | 0.081037 |
| 6 | 0.111370 |
| 7 | 0.169480 |
| 8 | 0.200614 |
| 9 | 0.243166 |
| 10 | 0.289330 |
| 11 | 0.346412 |
| 12 | 0.421382 |
| 13 | 0.506538 |
| 14 | 0.591292 |
| 15 | 0.684350 |
| 16 | 0.774846 |
| 17 | 0.841685 |
| 18 | 0.894353 |
| 19 | 0.938238 |
| 20 | 0.978894 |
| 21 | 1.036667 |
| 22 | 1.101122 |
| 23 | 1.213081 |
| 24 | 1.347362 |
| 25 | 1.569308 |
| 26 | 1.895770 |
| 27 | 2.358280 |
| 28 | 2.917744 |
| 29 | 3.607210 |
| 30 | 4.231404 |
| 31 | 4.862252 |
**Mean MSE:** 1.073099
## Training Details
- Source Model: EleutherAI/deep-ignorance-pretraining-stage-unfiltered (global_step38144)
- Target Model: EleutherAI/deep-ignorance-unfiltered
- Hidden Dimension: 4096
- Ridge Alpha: 0.01
- Layers: 0–31 (all 32 transformer layers)
- Training Examples: 100000
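The training procedure itself is not published here, but the hyperparameters above suggest a standard ridge-regression fit of an affine map `y ≈ x @ W.T + b` on paired (source, target) activations. The sketch below is a minimal closed-form illustration of that setup, not the authors' code; the data is synthetic and the dimension is shrunk from 4096 for readability, while `alpha = 0.01` mirrors the reported value.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8           # small stand-in for hidden_dim = 4096
n = 200         # stand-in for the 100000 training examples

# Synthetic paired activations: Y is an affine function of X
X = rng.standard_normal((n, d))
W_true = rng.standard_normal((d, d))
Y = X @ W_true.T + 0.5

# Closed-form ridge regression with a bias column (bias left unregularized)
alpha = 0.01
Xb = np.hstack([X, np.ones((n, 1))])
reg = alpha * np.eye(d + 1)
reg[-1, -1] = 0.0
theta = np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ Y)
W, b = theta[:-1].T, theta[-1]

# With a tiny alpha and noiseless targets, the fit is near-exact
mse = np.mean((X @ W.T + b - Y) ** 2)
assert mse < 1e-3
```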