Co-authored 'Robust agents learn causal world models', which received an Oral presentation and an Outstanding Paper Honorable Mention at ICLR 2024
Published 'Evaluating the Goal-Directedness of Large Language Models', introducing a measure of LLM goal-directedness that is empirically predictive and consistent across tasks
Co-authored 'The Reasons that Agents Act: Intention and Instrumental Goals' (AAMAS 2024), formalizing intent in causal models
Co-wrote 'AGI Safety Literature Review' (IJCAI 2018), a comprehensive survey of the AGI safety field
Co-developed 'AI Safety Gridworlds' (2017), making AGI safety problems concrete through testable environments
Proposed modeling AGI safety frameworks using causal influence diagrams (IJCAI AI Safety Workshop, 2019)
Developed a general method to infer agent incentives directly from graphical models, notably in 'Agent Incentives: A Causal Perspective' (AAAI 2021)
Conducted foundational AGI safety research based on the universal artificial intelligence (UAI/AIXI) framework, including a 2016 paper with Marcus Hutter
Background
Staff Research Scientist at Google DeepMind
Focuses on AGI safety: how to safely build and use highly intelligent AI systems
Authored the first PhD thesis specifically on AGI safety: 'Towards Safe Artificial General Intelligence'
Currently exploring AGI safety approaches based on amplification of human agency
Led the Causal Incentives Working Group, developing alignment theory grounded in Pearlian causality