🤖 AI Summary
This study addresses the problem of inferring unknown and unstructured incentive mechanisms—such as those represented by neural networks—from observed learning trajectories of self-interested multi-agent systems. To this end, the authors propose the Differentiable Inverse Multi-Agent Learning (DIML) framework, which models differentiable multi-agent learning dynamics and combines a conditional logit response model with maximum likelihood estimation to enable end-to-end recovery of the underlying incentives. This work presents the first differentiable inverse learning approach capable of handling unstructured incentive functions, establishes identifiability conditions for incentive differences, and proves the statistical consistency of the resulting estimator. Empirical evaluations demonstrate that DIML accurately recovers identifiable incentive structures, supports counterfactual prediction, matches the performance of an enumeration oracle in small-scale settings, and scales effectively to games involving hundreds of agents.
📝 Abstract
We study inverse mechanism learning: recovering an unknown incentive-generating mechanism from observed strategic interaction traces of self-interested learning agents. Unlike inverse game theory and multi-agent inverse reinforcement learning, which typically infer utility/reward parameters inside a structured mechanism, our target includes unstructured mechanisms -- a (possibly neural) mapping from joint actions to per-agent payoffs. Unlike differentiable mechanism design, which optimizes mechanisms forward, we infer mechanisms from behavior in an observational setting. We propose DIML, a likelihood-based framework that differentiates through a model of multi-agent learning dynamics and uses the candidate mechanism to generate the counterfactual payoffs needed to predict observed actions. We establish identifiability of payoff differences under a conditional logit response model and prove statistical consistency of maximum likelihood estimation under standard regularity conditions. We evaluate DIML on simulated interactions of learning agents across unstructured neural mechanisms, congestion tolling, public goods subsidies, and large-scale anonymous games. DIML reliably recovers identifiable incentive differences and supports counterfactual prediction; its performance rivals a tabular enumeration oracle in small environments, and its convergence scales to large environments with hundreds of participants. Code to reproduce our experiments is open-sourced.
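To make the likelihood principle behind DIML concrete, here is a minimal sketch of the conditional logit response model applied to mechanism inference. All names and the toy setup (two agents, a tabular payoff table standing in for a neural mechanism, alternating logit responses) are illustrative assumptions, not the paper's implementation; it shows only that a candidate mechanism's payoffs induce logit choice probabilities, whose negative log-likelihood over observed action traces scores the mechanism (and, per the identifiability result, only payoff differences matter, since adding a constant to all of an agent's payoffs leaves the probabilities unchanged).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a mechanism: a table mapping joint actions to
# per-agent payoffs, indexed [agent, action_of_agent_0, action_of_agent_1].
# (In DIML the mechanism may be a neural network; a table keeps this runnable.)
n_actions = 3
true_payoffs = rng.normal(size=(2, n_actions, n_actions))

def logit_probs(payoffs, agent, opp_action, beta=1.0):
    """Conditional logit response: P(a) proportional to exp(beta * u(a, opp_action)).

    Only payoff *differences* affect these probabilities -- adding a constant
    to u cancels in the softmax, matching the identifiability condition.
    """
    if agent == 0:
        u = payoffs[0, :, opp_action]
    else:
        u = payoffs[1, opp_action, :]
    z = beta * u
    z = z - z.max()          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def simulate(payoffs, steps=500):
    """Simulate alternating logit-response play; record (agent, opp_action, action)."""
    a = [0, 0]
    trace = []
    for _ in range(steps):
        for i in range(2):
            p = logit_probs(payoffs, i, a[1 - i])
            a[i] = int(rng.choice(n_actions, p=p))
            trace.append((i, a[1 - i], a[i]))
    return trace

def neg_log_lik(payoffs, trace):
    """Score a candidate mechanism by the NLL of the observed actions."""
    return -sum(np.log(logit_probs(payoffs, agent, opp)[act])
                for agent, opp, act in trace)

trace = simulate(true_payoffs)

# A heavily perturbed candidate should fit the observed trace worse than
# the true mechanism, which is what gradient-based MLE exploits.
perturbed = true_payoffs + rng.normal(scale=2.0, size=true_payoffs.shape)
# Shifting all payoffs by a constant leaves the likelihood unchanged
# (unidentifiable direction); perturbing differences does not.
shifted = true_payoffs + 5.0
```

In the full framework the NLL would be minimized by gradient descent through the mechanism's parameters (and through the modeled learning dynamics), rather than compared between fixed candidates as here.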