Scholar
Senthooran Rajamanoharan
Google Scholar ID: W4JthzgAAAAJ
Google DeepMind
Mechanistic Interpretability
Machine Learning
Citations & Impact (all-time)
Citations: 709
H-index: 11
i10-index: 11
Publications: 20
Co-authors: 18
Contact
No contact links provided.
Publications
13 total (8 shown)
Emergent Misalignment is Easy, Narrow Misalignment is Hard (2026). Cited: 0
Thought Branches: Interpreting LLM Reasoning Requires Resampling (2025). Cited: 0
Eliciting Secret Knowledge from Language Models (2025). Cited: 0
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning (2025). Cited: 0
When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors (2025). Cited: 0
Dense SAE Latents Are Features, Not Bugs (2025). Cited: 0
Convergent Linear Representations of Emergent Misalignment (2025). Cited: 1
Model Organisms for Emergent Misalignment (2025). Cited: 0
Resume (English only)
Co-authors
18 total
Neel Nanda
Mechanistic Interpretability Team Lead, Google DeepMind
Arthur Conmy
Google DeepMind
Co-author 3
Vikrant Varma
DeepMind
Co-author 5
Rohin Shah
Research Scientist, Google DeepMind
Co-author 7
Brian Williams
Professor of Aeronautics and Astronautics, Massachusetts Institute of Technology