🤖 AI Summary
This work addresses preference learning in multi-agent games where a coordinator, unaware of participants' utility functions, must infer preferences through action recommendations and observed compliance behavior. Integrating game theory, online learning, and statistical learning theory, the paper proposes a low-regret online recommendation algorithm. It establishes a novel theory of utility learnability based on compliance in unknown games: under a quantal-response model, it identifies utility functions up to positive affine equivalence with sample complexity logarithmic in the desired precision; under a best-response model, it fully characterizes the geometric structure of the identifiable set. The proposed algorithm attains regret bounds that scale linearly with the game's dimension and logarithmically with time under both feedback models.
📝 Abstract
We study preference learning through recommendations in multi-agent game settings, where a moderator repeatedly interacts with agents whose utility functions are unknown. In each round, the moderator issues action recommendations and observes whether agents follow or deviate from them. We consider two canonical behavioral feedback models, best response and quantal response, and study how the information revealed by each model affects the learnability of agents' utilities. We show that under quantal-response feedback the game is learnable, up to a positive affine equivalence class, with logarithmic sample complexity in the desired precision, whereas best-response feedback can only identify a larger set of agents' utilities. We give a complete geometric characterization of this set. Moreover, we introduce a regret notion based on agents' incentives to deviate from recommendations and design an online algorithm with low regret under both feedback models, with bounds scaling linearly in the game dimension and logarithmically in time. Our results lay a theoretical foundation for AI recommendation systems in strategic multi-agent environments, where recommendation compliance is shaped by strategic interaction.
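The two feedback models can be sketched in a few lines. The sketch below is illustrative, not the paper's exact formulation: the function names, the logit (softmax) parameterization of quantal response, and the rationality parameter λ are assumptions. It also makes the affine-equivalence point concrete: shifting all utilities by a constant leaves logit choice probabilities unchanged, and rescaling them is absorbed into λ, so compliance data cannot distinguish utilities within a positive affine class.

```python
import math

def quantal_response_probs(utilities, rationality=1.0):
    """Logit quantal response: P(a) is proportional to exp(lambda * u(a))."""
    m = max(utilities)  # subtract max for numerical stability
    weights = [math.exp(rationality * (u - m)) for u in utilities]
    z = sum(weights)
    return [w / z for w in weights]

def best_response(utilities):
    """Best response: deterministically pick an argmax action."""
    return max(range(len(utilities)), key=lambda i: utilities[i])

def compliance_prob(utilities, recommended, rationality=None):
    """Probability the agent follows the recommended action.

    rationality=None models best-response feedback (0/1 compliance);
    a float models quantal-response feedback with that lambda.
    """
    if rationality is None:
        return 1.0 if recommended == best_response(utilities) else 0.0
    return quantal_response_probs(utilities, rationality)[recommended]
```

Under best response, observing compliance only reveals whether the recommendation was a utility maximizer, which is why it identifies a coarser set than the graded probabilities of quantal response.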