Ryan Carey

Google Scholar ID: 9U1CpcAAAAAJ
University of Oxford
AI Safety · Causality · Incentives
Citations & Impact (all-time)
  • Citations: 271
  • H-index: 9
  • i10-index: 9
  • Publications: 19
  • Co-authors: 1
Academic Achievements
  • "Human Control: Definitions and Algorithms" (UAI 2023): Studies definitions of human control (e.g., corrigibility, alignment), their guarantees for human autonomy, and associated algorithms.
  • "Reasoning about Causality in Games" (Artificial Intelligence Journal 2023): Introduces structural causal games as a unified framework for causal and game-theoretic reasoning.
  • "Path-Specific Objectives for Safer Agent Incentives" (AAAI 2022): Addresses how to optimize objectives without undesirable means (e.g., user manipulation).
  • "A Complete Criterion for Value of Information in Soluble Influence Diagrams" (AAAI 2022): Provides a complete graphical criterion for value of information in multi-decision influence diagrams.
  • "Why Fair Labels Can Yield Unfair Predictions" (AAAI 2022): Shows how unfairness can be incentivized even with perfectly fair labels, with graphical conditions.
  • "Agent Incentives: A Causal Perspective" (AAAI 2021): Presents sound and complete graphical criteria for four types of agent incentives.
  • "Incorrigibility in the CIRL Framework" (AIES 2018): Analyzes how Cooperative Inverse Reinforcement Learning (CIRL) may fail to prevent incorrigible behavior.
Research Experience
  • Research Fellow at the Future of Humanity Institute.
  • Research Intern at DeepMind.
  • Research Intern at OpenAI.
  • Founder of the EA Forum (Effective Altruism Forum).
  • Co-founder of the Causal Incentives Working Group, which applies causal models to AI safety.