🤖 AI Summary
This paper addresses inverse utility learning for multiple agents in repeated normal-form games: recovering the agents' unknown utility functions by observing their behavioral responses, including their reactions to signals and payments. It proposes a payment-mechanism-guided active learning framework that rigorously distinguishes and models two distinct behavioral paradigms: iterated dominance elimination and no-regret learning. The method couples a state-dependent payment and signaling mechanism with a polynomial-round interactive learning algorithm. Theoretically, all agents' utility functions can be approximated to any accuracy $\varepsilon > 0$ within $O(\mathrm{poly}(1/\varepsilon))$ rounds, with nearly matching lower bounds; moreover, learning is provably faster under the iterated dominance model than under the no-regret model. These results enable the first equilibrium-steering algorithm for games that requires no prior knowledge of agent utilities.
📝 Abstract
We study the problem of learning the utility functions of agents in a normal-form game by observing the agents play the game repeatedly. Differing from most prior literature, we introduce a principal with the power to observe the agents playing the game, send the agents signals, and send the agents payments as a function of their actions. Under reasonable behavioral models for the agents such as iterated dominated action removal or a no-regret assumption, we show that the principal can, using a number of rounds polynomial in the size of the game, learn the utility functions of all agents to any desired precision $\varepsilon>0$. We also show lower bounds in both models, which nearly match the upper bounds in the former model and also strictly separate the two models: the principal can learn strictly faster in the iterated dominance model. Finally, we discuss implications for the problem of steering agents to a desired equilibrium: in particular, we introduce, using our utility-learning algorithm as a subroutine, the first algorithm for steering learning agents without prior knowledge of their utilities.
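To make the no-regret behavioral model concrete, the sketch below simulates two agents running multiplicative-weights updates (a standard no-regret algorithm) in a repeated 2x2 normal-form game, while a principal passively records the played action profiles. The game matrices, learning rate, and round count are illustrative assumptions, not the paper's construction; the paper's principal additionally sends signals and action-dependent payments, which are omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 game (a matching-pennies variant); these utilities are
# exactly what the principal does NOT know and must learn in the paper.
U_row = np.array([[1.0, 0.0], [0.0, 1.0]])  # row player's utility
U_col = np.array([[0.0, 1.0], [1.0, 0.0]])  # column player's utility

def mw_update(weights, utilities, eta=0.1):
    """Multiplicative-weights (no-regret) update from per-action utilities."""
    w = weights * np.exp(eta * utilities)
    return w / w.sum()

w_row = np.ones(2) / 2  # row player's mixed strategy
w_col = np.ones(2) / 2  # column player's mixed strategy
history = []            # the principal observes only the played profiles

for t in range(1000):
    a = rng.choice(2, p=w_row)
    b = rng.choice(2, p=w_col)
    history.append((a, b))
    # Each agent updates on its expected utility per action against the
    # opponent's current mixed strategy.
    w_row = mw_update(w_row, U_row @ w_col)
    w_col = mw_update(w_col, U_col.T @ w_row)
```

The recorded `history` is the kind of behavioral data the principal works from; the paper's contribution is to show how actively chosen payments and signals make such observations sufficient to recover `U_row` and `U_col` to precision $\varepsilon$ in polynomially many rounds.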