🤖 AI Summary
This work models the incentives that shape agent decision-making, systematically distinguishing *response incentives* (environmental variables that the optimal policy responds to), *instrumental control incentives* (variables the agent benefits from manipulating, e.g., user preferences), and *influence incentives* (variables the agent alters, whether intentionally or not). We propose the structural causal influence model (SCIM), the first framework unifying influence diagrams with structural causal models. Building on causal reasoning, graph theory, and formal modeling, we derive the first decidable graphical criteria for identifying and classifying all three incentive types in single-decision settings. Our approach enables precise, theoretically grounded incentive attribution, enhancing the interpretability and controllability of agent behavior. We illustrate how the framework predicts behavioral tendencies in fairness- and AI safety-relevant applications, supporting more robust and transparent autonomous decision-making.
📝 Abstract
Which variables does an agent have an incentive to control with its decision, and which variables does it have an incentive to respond to? We formalise these incentives, and demonstrate unique graphical criteria for detecting them in any single-decision causal influence diagram. To this end, we introduce structural causal influence models, a hybrid of the influence diagram and structural causal model frameworks. Finally, we illustrate how these incentives predict agent behaviour in both fairness and AI safety applications.
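To give a flavour of the kind of graphical criterion the abstract refers to: in a single-decision causal influence diagram, an instrumental control incentive on a variable X requires a directed path from the decision D to a utility node U that passes through X. Below is a minimal sketch of that path check; the adjacency-dict graph encoding and the node names in the example are our own illustration, not from the paper.

```python
def directed_path_through(graph, start, via, goals):
    """True if some directed path start -> ... -> via -> ... -> goal exists,
    where graph maps each node to a list of its children."""
    def reachable(src, targets):
        # Iterative depth-first search for any node in `targets`.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node in targets:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph.get(node, []))
        return False
    # A path D -> ... -> X combined with a path X -> ... -> U
    # gives a directed path from D to U through X.
    return reachable(start, {via}) and reachable(via, set(goals))

# Illustrative content-recommendation diagram (hypothetical node names):
cid = {
    "D": ["X", "U"],  # decision: which posts to show
    "X": ["U"],       # user opinion, influenced by D
    "U": [],          # utility: predicted clicks
    "Z": ["D"],       # observed context; D cannot reach Z
}
print(directed_path_through(cid, "D", "X", ["U"]))  # True: incentive to control X
print(directed_path_through(cid, "D", "Z", ["U"]))  # False: no incentive on Z
```

The contrast between X and Z is the point of the criterion: the agent can only have an instrumental control incentive on variables its decision can actually influence on the way to utility.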