🤖 AI Summary
This paper addresses equilibrium selection in finite normal-form games by proposing multi-agent learning dynamics grounded in statistical hypothesis testing, designed to converge to a Nash equilibrium that maximizes the minimum (transformed) utility across all players. Methodologically, each agent episodically tests hypotheses about opponents' strategies against empirical observations, resampling its beliefs whenever a test is rejected; a utility-dependent exploration mechanism, whose probability decays with the agent's (transformed) utility, balances belief refinement against exploration. A key contribution is the integration of statistical hypothesis testing into game-theoretic learning in a way that enables endogenous selection of maximin equilibria, without recourse to external refinement criteria. The authors show that the dynamics converge to a set of approximate Nash equilibria in general finite games and consistently favor solutions that improve the global minimum utility, yielding a novel, interpretable, and adaptive paradigm for equilibrium selection.
📝 Abstract
We introduce new hypothesis-testing-based learning dynamics in which players update their strategies by combining hypothesis testing with utility-driven exploration. Under these dynamics, each player forms beliefs about opponents' strategies and episodically tests these beliefs against empirical observations. Beliefs are resampled either when the hypothesis test is rejected or through exploration, where the probability of exploration decreases with the player's (transformed) utility. For general finite normal-form games, we show that the learning process converges to a set of approximate Nash equilibria and, more importantly, to a refinement that selects equilibria maximizing the minimum (transformed) utility across all players. Our result establishes convergence to equilibrium in general finite games and reveals a novel mechanism for equilibrium selection induced by the structure of the learning dynamics.
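The episodic loop described in the abstract can be sketched in a two-player matrix game. Everything concrete below is an illustrative assumption rather than the paper's exact construction: pure best responses to beliefs, a total-variation test with threshold `tol` in place of the paper's statistical test, uniform Dirichlet resampling, and an `exp(-lam * u)` exploration schedule standing in for the utility-decaying exploration probability (so it presumes nonnegative transformed utilities).

```python
import numpy as np

def run_dynamics(A, B, episodes=300, tol=0.5, lam=1.0, seed=0):
    """Toy sketch of hypothesis-testing learning dynamics.

    A, B: n x m payoff matrices for players 1 and 2.
    Each player holds a belief (a mixed strategy) about the opponent,
    best-responds to it, then episodically tests the belief against the
    opponent's observed play. On rejection, or with probability
    exp(-lam * utility) (exploration decaying in utility), the belief
    is resampled. Returns a matrix counting visits to each action profile.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    belief1 = rng.dirichlet(np.ones(m))  # player 1's model of player 2
    belief2 = rng.dirichlet(np.ones(n))  # player 2's model of player 1
    counts = np.zeros((n, m))
    for _ in range(episodes):
        # Pure best responses to current beliefs.
        a1 = int(np.argmax(A @ belief1))
        a2 = int(np.argmax(belief2 @ B))
        counts[a1, a2] += 1
        u1, u2 = A[a1, a2], B[a1, a2]
        emp1 = np.eye(n)[a1]  # observed (degenerate) play this episode
        emp2 = np.eye(m)[a2]
        # Hypothesis test: reject a belief far from observed play
        # (total-variation distance > tol), then resample on rejection
        # or exploration.
        if 0.5 * np.abs(belief1 - emp2).sum() > tol or rng.random() < np.exp(-lam * u1):
            belief1 = rng.dirichlet(np.ones(m))
        if 0.5 * np.abs(belief2 - emp1).sum() > tol or rng.random() < np.exp(-lam * u2):
            belief2 = rng.dirichlet(np.ones(n))
    return counts
```

In a coordination game such as `A = B = np.diag([1.0, 2.0])`, both diagonal profiles are pure Nash equilibria, but the high-payoff one has the smaller exploration probability `exp(-2*lam)`, so resampling disturbs it less often. This is the intuition behind the selection effect: the dynamics linger longest at equilibria with the highest minimum (transformed) utility.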