Scholar
Wes Gurnee
Google Scholar ID: 5sxXSfwAAAAJ
Anthropic
Mechanistic Interpretability
AI Alignment
Optimization
Governance
Citations & Impact
All-time
Citations
1,536
H-index
12
i10-index
13
Publications
18
Co-authors
14
list available
Publications
4 items
Emotion Concepts and their Function in a Large Language Model
2026
Cited
0
When Models Manipulate Manifolds: The Geometry of a Counting Task
arXiv.org · 2026
Cited
10
The Remarkable Robustness of LLMs: Stages of Inference?
arXiv.org · 2024
Cited
48
Not All Language Model Features Are One-Dimensionally Linear
2024
Cited
40
Co-authors
14 total
Neel Nanda
Mechanistic Interpretability Team Lead, Google DeepMind
Dimitris Bertsimas
Boeing Professor of Operations Research, MIT
Max Tegmark
Professor of Physics, MIT
Andy Arditi
Northeastern University
Nina Panickssery
Anthropic
Co-author 6
Isaac Liao
Carnegie Mellon University
Joshua Engels
Google DeepMind