Senthooran Rajamanoharan

Google Scholar ID: W4JthzgAAAAJ
Google DeepMind
Mechanistic Interpretability, Machine Learning
Citations & Impact (all-time)
Citations: 709
H-index: 11
i10-index: 11
Publications: 20
Co-authors: 18
Publications
13 items
Emergent Misalignment is Easy, Narrow Misalignment is Hard (2026), cited by 0
Thought Branches: Interpreting LLM Reasoning Requires Resampling (2025), cited by 0
Eliciting Secret Knowledge from Language Models (2025), cited by 0
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning (2025), cited by 0
When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors (2025), cited by 0
Dense SAE Latents Are Features, Not Bugs (2025), cited by 0
Convergent Linear Representations of Emergent Misalignment (2025), cited by 1
Model Organisms for Emergent Misalignment (2025), cited by 0
Co-authors
18 total
Neel Nanda, Mechanistic Interpretability Team Lead, Google DeepMind
Arthur Conmy, Google DeepMind
Co-author 3
Vikrant Varma, DeepMind
Co-author 5
Rohin Shah, Research Scientist, Google DeepMind
Co-author 7
Brian Williams, Professor of Aeronautics and Astronautics, Massachusetts Institute of Technology
