matlok's Collections
Papers - Interpretability
Prompt-to-Prompt Image Editing with Cross Attention Control
Paper • 2208.01626
BERT Rediscovers the Classical NLP Pipeline
Paper • 1905.05950
A Multiscale Visualization of Attention in the Transformer Model
Paper • 1906.05714
Analyzing Transformers in Embedding Space
Paper • 2209.02535
LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models
Paper • 2404.03118
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Paper • 2406.01506
Confidence Regulation Neurons in Language Models
Paper • 2406.16254
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
Paper • 2411.00743
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Paper • 2411.14257
Paper • 2412.09764