Neel Nanda (@neelnanda5)'s Twitter Profile
Neel Nanda

@neelnanda5

Mechanistic Interpretability lead @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!

ID: 1542528075128348674

Link: http://neelnanda.io

Joined: 30-06-2022 15:18:58

2.2K Tweets

17.17K Followers

91 Following

I really enjoyed Chris Olah's write-up on defining linear features, and how it relates to e.g. multidimensional features vs one-dimensional SAE features. I felt like it nicely crystallised a bunch of my existing intuitions into words. transformer-circuits.pub/2024/july-upda…