Scholar
Senthooran Rajamanoharan
Google Scholar ID: W4JthzgAAAAJ
Google DeepMind
Mechanistic Interpretability
Machine Learning
Citations & Impact (all-time)
Citations: 709
H-index: 11
i10-index: 11
Publications: 20
Co-authors: 18
Contact
No contact links provided.
Publications
13 total (8 shown)
Emergent Misalignment is Easy, Narrow Misalignment is Hard (2026). Cited: 0
Thought Branches: Interpreting LLM Reasoning Requires Resampling (2025). Cited: 0
Eliciting Secret Knowledge from Language Models (2025). Cited: 0
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning (2025). Cited: 0
When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors (2025). Cited: 0
Dense SAE Latents Are Features, Not Bugs (2025). Cited: 0
Convergent Linear Representations of Emergent Misalignment (2025). Cited: 1
Model Organisms for Emergent Misalignment (2025). Cited: 0
Resume (English only)
Co-authors
18 total
Neel Nanda
Mechanistic Interpretability Team Lead, Google DeepMind
Arthur Conmy
Google DeepMind
Co-author 3
Vikrant Varma
DeepMind
Co-author 5
Rohin Shah
Research Scientist, Google DeepMind
Co-author 7
Brian Williams
Professor of Aeronautics and Astronautics, Massachusetts Institute of Technology