intuition behind layer selection

#1
by ada-flo - opened

Hello, thank you for your inspiring work! We were just wondering if there's any sharable insight behind your selection of layer 137?

Is it backed up by any numerical analysis? We are doing a similar line of research and would greatly benefit from your insights. Thank you!

Goodfire org

It's actually layer 37! I don't think much analysis went into this. IIRC, we just picked a layer roughly two-thirds of the way through the model, since that depth often works well for SAEs in LLMs.
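
For anyone wanting to try the same rule of thumb on another model, here is a minimal sketch of the arithmetic. The function name and the example depths are hypothetical and not taken from the Goodfire setup, and whether the resulting index is 0-based or 1-based depends on how your framework numbers its layers.

```python
def two_thirds_layer(num_layers: int) -> int:
    """Pick a layer index roughly two-thirds of the way through a model.

    Assumption: layers are indexed 0 .. num_layers - 1. This is only the
    depth heuristic described above, not a tuned or validated choice.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (2 * num_layers) // 3


# Hypothetical depths, purely for illustration.
for depth in (12, 48):
    print(depth, two_thirds_layer(depth))
```

It is probably worth treating the result as a starting point and comparing a few neighbouring layers, since the heuristic is not backed by a numerical analysis.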

ada-flo changed discussion status to closed
