---
language: en
tags:
- clip
- vision
- transformers
- interpretability
- sparse autoencoder
- sae
- mechanistic interpretability
license: apache-2.0
library_name: torch
pipeline_tag: feature-extraction
metrics:
- type: explained_variance
  value: 98.5
  pretty_name: Explained Variance %
  range:
    min: 0
    max: 100
- type: l0
  value: 1408.15
  pretty_name: L0
---

# CLIP-B-32 Sparse Autoencoder x64 vanilla - L1:1e-05

### Training Details

- Base Model: CLIP-ViT-B-32 (LAION DataComp.XL-s13B-b90K)
- Layer: 11
- Component: hook_resid_post

### Model Architecture

- Input Dimension: 768
- SAE Dimension: 49,152
- Expansion Factor: x64 (vanilla architecture)
- Activation Function: ReLU
- Initialization: encoder_transpose_decoder
- Context Size: 50 tokens

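The architecture above can be sketched in PyTorch. This is a minimal illustration of a "vanilla" SAE (linear encoder, ReLU, linear decoder) with the card's dimensions (768 in, x64 expansion to 49,152) and an `encoder_transpose_decoder`-style init (unit-norm decoder rows, encoder starting as the decoder transpose); the class and attribute names are illustrative, not the actual Prisma implementation.

```python
import torch
import torch.nn as nn

class VanillaSAE(nn.Module):
    """Sketch of a vanilla sparse autoencoder (not the released code)."""

    def __init__(self, d_in: int = 768, expansion: int = 64):
        super().__init__()
        d_sae = d_in * expansion  # 768 * 64 = 49,152
        self.W_enc = nn.Parameter(torch.empty(d_in, d_sae))
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.W_dec = nn.Parameter(torch.empty(d_sae, d_in))
        self.b_dec = nn.Parameter(torch.zeros(d_in))
        # encoder_transpose_decoder-style init (assumed): random unit-norm
        # decoder directions, encoder tied to the decoder transpose at init.
        nn.init.kaiming_uniform_(self.W_dec)
        with torch.no_grad():
            self.W_dec /= self.W_dec.norm(dim=-1, keepdim=True)
            self.W_enc.copy_(self.W_dec.T)

    def forward(self, x: torch.Tensor):
        # Encode (subtracting the decoder bias first is a common convention),
        # apply ReLU, then reconstruct.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
        recon = acts @ self.W_dec + self.b_dec
        return recon, acts

sae = VanillaSAE()
x = torch.randn(4, 768)         # a batch of residual-stream activations
recon, acts = sae(x)
print(recon.shape, acts.shape)  # (4, 768) and (4, 49152)
```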
### Performance Metrics

- L1 Coefficient: 1e-05
- L0 Sparsity: 1408.15
- Explained Variance: 0.9848 (98.48%)

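For readers unfamiliar with these metrics: L0 is the mean number of active (nonzero) SAE features per input, and explained variance measures how much of the input activation variance the reconstruction captures. The helpers below show one common way to compute them, plus the MSE + L1 training objective that the L1 coefficient controls; exact definitions in the training run may differ.

```python
import torch

def sae_loss(x, recon, acts, l1_coef=1e-5):
    """Reconstruction MSE plus an L1 penalty on feature activations."""
    mse = (x - recon).pow(2).mean()
    l1 = acts.abs().sum(dim=-1).mean()
    return mse + l1_coef * l1

def l0_sparsity(acts):
    """L0: mean number of nonzero SAE features per input."""
    return (acts != 0).float().sum(dim=-1).mean().item()

def explained_variance(x, recon):
    """Fraction of input variance captured by the reconstruction."""
    resid = (x - recon).pow(2).sum()
    total = (x - x.mean(dim=0)).pow(2).sum()
    return (1 - resid / total).item()

acts = torch.tensor([[1.0, 0.0, 2.0], [0.0, 0.0, 3.0]])
print(l0_sparsity(acts))  # 1.5 (rows have 2 and 1 active features)
```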
### Training Configuration

- Learning Rate: 0.0004
- LR Scheduler: Cosine Annealing with Warmup (200 steps)
- Epochs: 10
- Gradient Clipping: 1.0
- Device: NVIDIA Quadro RTX 8000

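A warmup-then-cosine schedule like the one listed can be assembled from stock PyTorch schedulers. This is a sketch under assumptions, not the run's actual training loop: the model is a small stand-in for the SAE, `total_steps` is a placeholder for the real training length, and only the lr=4e-4 and 200-step warmup match the card.

```python
import torch

model = torch.nn.Linear(8, 8)  # stand-in for the SAE parameters
opt = torch.optim.Adam(model.parameters(), lr=4e-4)

total_steps = 10_000  # placeholder; the real run's step count is not listed
# 200 linear warmup steps up to the base lr, then cosine annealing.
warmup = torch.optim.lr_scheduler.LinearLR(opt, start_factor=0.01, total_iters=200)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_steps - 200)
sched = torch.optim.lr_scheduler.SequentialLR(opt, [warmup, cosine], milestones=[200])

for step in range(300):
    opt.step()    # gradient computation omitted in this sketch
    sched.step()
print(sched.get_last_lr())  # past warmup, now decaying from 4e-4
```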
**Experiment Tracking:**

- Weights & Biases Run ID: j4bop08g
- Full experiment details: https://wandb.ai/perceptual-alignment/clip/runs/j4bop08g/overview
- Git Commit: e22dd02726b74a054a779a4805b96059d83244aa

## Citation

```bibtex
@misc{2024josephsparseautoencoders,
    title={Sparse Autoencoders for CLIP-ViT-B-32},
    author={Joseph, Sonia},
    year={2024},
    publisher={Prisma-Multimodal},
    url={https://huggingface.co/Prisma-Multimodal},
    note={Layer 11, hook_resid_post, Run ID: j4bop08g}
}
```