Policy Learning with Competing Agents

📅 2022-04-04

🏛️ arXiv.org

📈 Citations: 10

✨ Influential: 0

career value

209K/year

🤖 AI Summary

This paper addresses the treatment allocation problem under capacity constraints with strategic agent responses: conventional policy learning fails due to competitive interference when heterogeneous agents strategically modify their behavior to maximize individual utility. To address this, we formulate a dynamic game model and introduce mean-field approximation to characterize large-scale interactions. We further propose the first consistent policy gradient estimation framework for strategic responses under capacity constraints. Theoretically, we prove that threshold-type optimal policies converge to the mean-field equilibrium threshold as population size grows. Semi-synthetic experiments on the NELS-88 dataset demonstrate that our method significantly improves causal intervention efficacy compared to standard policy learning approaches that ignore strategic competition.

📝 Abstract

Decision makers often aim to learn a treatment assignment policy under a capacity constraint on the number of agents that they can treat. When agents can respond strategically to such policies, competition arises, complicating estimation of the optimal policy. In this paper, we study capacity-constrained treatment assignment in the presence of such interference. We consider a dynamic model where the decision maker allocates treatments at each time step and heterogeneous agents myopically best respond to the previous treatment assignment policy. When the number of agents is large but finite, we show that the threshold for receiving treatment under a given policy converges to the policy's mean-field equilibrium threshold. Based on this result, we develop a consistent estimator for the policy gradient. In a semi-synthetic experiment with data from the National Education Longitudinal Study of 1988, we demonstrate that this estimator can be used for learning capacity-constrained policies in the presence of strategic behavior.

Problem

Research questions and friction points this paper is trying to address.

Learning treatment assignment under capacity constraints

Addressing strategic agent competition in policy estimation

Developing consistent estimators for optimal policy gradients

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mean-field equilibrium threshold convergence

Consistent policy gradient estimator development

Semi-synthetic experiment validation strategic behavior

🔎 Similar Papers

Contrastive learning-based agent modeling for deep reinforcement learning