Neel Nanda

NeelNanda

https://neelnanda.io

AI & ML interests

Mechanistic Interpretability

upvoted a paper over 1 year ago

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Paper • 2403.00745 • Published Mar 1, 2024 • 14

upvoted a paper about 2 years ago

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Paper • 2307.09458 • Published Jul 18, 2023 • 11