Sparse Transcoders for Qwen2.5-VL-7B Released
#1, opened by KokosDev
Hi everyone!
I've released sparse transcoders for Qwen2.5-VL-7B-Instruct for mechanistic interpretability research.
What's Included
- L0-L26 transcoders (one per decoder layer)
- 8,192 sparse features per layer
- Cross-layer architecture (PLT)
- Full code and documentation
- Apache 2.0 license
Use Cases
- Feature discovery and analysis
- Circuit analysis
- Bias detection research
- Model steering experiments
- Feature suppression or amplification
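To make the last two use cases concrete, here is a minimal sketch of feature suppression and amplification on toy data. It assumes a common transcoder form (a ReLU encoder followed by a linear decoder that approximates an MLP's output from its input). The names `W_enc`, `b_enc`, `W_dec`, `b_dec` and the helper `steer` are hypothetical, and the toy sizes are not the real dimensions; the released weights may use a different layout, so check the README before adapting this.

```python
import numpy as np

def encode(x, W_enc, b_enc):
    """Sparse feature activations under the assumed form ReLU(x @ W_enc + b_enc)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def decode(f, W_dec, b_dec):
    """Map feature activations back to model space."""
    return f @ W_dec + b_dec

def steer(x, W_enc, b_enc, W_dec, b_dec, feature_idx, scale):
    """Decode after scaling one feature: scale=0 suppresses it, scale>1 amplifies it."""
    f = encode(x, W_enc, b_enc)
    f[..., feature_idx] *= scale
    return decode(f, W_dec, b_dec)

# Toy random weights (hypothetical sizes, chosen only to keep the example fast).
rng = np.random.default_rng(0)
d_model, n_features = 16, 64
W_enc = rng.normal(size=(d_model, n_features))
b_enc = np.zeros(n_features)
W_dec = rng.normal(size=(n_features, d_model))
b_dec = np.zeros(d_model)

x = rng.normal(size=(4, d_model))  # stand-in for 4 token activations
baseline = steer(x, W_enc, b_enc, W_dec, b_dec, feature_idx=3, scale=1.0)
suppressed = steer(x, W_enc, b_enc, W_dec, b_dec, feature_idx=3, scale=0.0)
amplified = steer(x, W_enc, b_enc, W_dec, b_dec, feature_idx=3, scale=5.0)
```

In a real experiment the steered output would replace the corresponding activation inside the model through a forward hook; that part is left out here because it depends on the actual module names.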
Quick Start
from safetensors.torch import load_file
transcoder = load_file("transcoder_L25.safetensors")
# Hook into Qwen2.5-VL and extract features
See the README for complete usage examples.
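Before wiring up hooks, it can help to look at what a checkpoint actually contains. The sketch below lists tensor names, shapes, and dtypes for the file from the Quick Start. The helper `summarize_tensors` is hypothetical, and nothing is assumed about the tensor names in the file: the script only prints whatever it finds.

```python
from pathlib import Path

def summarize_tensors(state):
    """Return sorted (name, shape, dtype) rows for a mapping of name -> tensor."""
    return [(name, tuple(t.shape), str(t.dtype)) for name, t in sorted(state.items())]

path = Path("transcoder_L25.safetensors")  # file name taken from the Quick Start
if path.exists():
    # Imported lazily so the helper above works without safetensors installed.
    from safetensors.torch import load_file

    for name, shape, dtype in summarize_tensors(load_file(str(path))):
        print(f"{name:40s} {shape} {dtype}")
else:
    print(f"{path} not found; download it from the repository first")
```

The printed shapes should show which tensors act as encoder and decoder weights, which is the information needed to apply the transcoder to captured activations.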
Next: working on a 32B version!
Repository: https://huggingface.co/KokosDev/qwen2p5vl-7b-plt
Questions and feedback welcome!