🤖 AI Summary
This work addresses the limited robustness of classical mean-field games, which typically assume a fixed initial population distribution and fully rational agents. To overcome these limitations, the authors propose a novel equilibrium concept, the Mean-Field Risk-Averse Quantal Response Equilibrium (MF-RQE), which integrates risk aversion and bounded rationality: agents are ambiguity-averse toward the initial distribution and make suboptimal (quantal-response) decisions. To compute MF-RQE policies in large state-action spaces, the authors develop an efficient, scalable algorithm that combines fixed-point iteration, fictitious play, and entropy-regularized reinforcement learning. Numerical experiments show that the proposed approach significantly outperforms standard mean-field game solutions under perturbations of the initial distribution, exhibiting improved robustness and stability.
📝 Abstract
Recent advances in the mean-field game literature enable the reduction of large-scale multi-agent problems to tractable interactions between a representative agent and a population distribution. However, existing approaches typically assume a fixed initial population distribution and fully rational agents, limiting robustness under distributional uncertainty and cognitive constraints. We address these limitations by introducing risk aversion with respect to the initial population distribution and by incorporating bounded rationality to model agents' deviations from fully rational decision-making. The combination of these two elements yields a new and more general equilibrium concept, which we term the mean-field risk-averse quantal response equilibrium (MF-RQE). We establish existence results and prove convergence of both fixed-point iteration and fictitious play to the MF-RQE. Building on these insights, we develop a scalable reinforcement learning algorithm for scenarios with large state-action spaces. Numerical experiments demonstrate that MF-RQE policies achieve improved robustness relative to classical mean-field approaches that optimize expected cumulative rewards under a fixed initial distribution and are restricted to entropy-based regularizers.
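The fixed-point/fictitious-play scheme mentioned above can be illustrated on a toy finite mean-field game. This is a minimal sketch under stated assumptions, not the paper's algorithm: the congestion-style reward, the transition kernel, and all variable names are invented for illustration, and the risk-averse (ambiguity-averse) treatment of the initial distribution is omitted. It shows the two ingredients the abstract names: a quantal (entropy-regularized softmax) best response computed by backward induction, and fictitious play as a running average of the population flows induced by successive responses.

```python
import numpy as np

# Toy finite-horizon mean-field game (illustrative assumptions throughout):
# 3 states, 2 actions, horizon 10; reward penalizes moving into crowded states.
rng = np.random.default_rng(0)
S, A, T = 3, 2, 10
tau = 0.5                                     # temperature: bounded rationality
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a] is a distribution over next states
base = rng.normal(size=(S, A))                # state-action base reward

def softmax(q, tau):
    z = np.exp((q - q.max(axis=-1, keepdims=True)) / tau)
    return z / z.sum(axis=-1, keepdims=True)

def quantal_response(mu_traj):
    """Entropy-regularized (quantal) best response to a population flow mu_traj."""
    pi = np.zeros((T, S, A))
    V = np.zeros(S)
    for t in reversed(range(T)):
        r = base - 1.0 * (P @ mu_traj[t])     # congestion penalty on successor states
        Q = r + P @ V                          # one-step lookahead, shape (S, A)
        pi[t] = softmax(Q, tau)                # quantal response instead of argmax
        V = tau * np.log(np.exp(Q / tau).sum(axis=-1))  # soft (log-sum-exp) value
    return pi

def induced_flow(pi, mu0):
    """Population distribution trajectory induced by policy pi from mu0."""
    mu = np.zeros((T, S))
    mu[0] = mu0
    for t in range(T - 1):
        mu[t + 1] = np.einsum('s,sa,sap->p', mu[t], pi[t], P)
    return mu

# Fictitious play: respond to the time-averaged flow, then update the average.
mu0 = np.ones(S) / S
mu_bar = np.tile(mu0, (T, 1))
for k in range(1, 200):
    pi = quantal_response(mu_bar)
    mu_bar += (induced_flow(pi, mu0) - mu_bar) / k   # running average of flows
```

At a fixed point of this iteration, the averaged flow `mu_bar` is consistent with the quantal response it induces, which is the consistency condition a quantal-response-type mean-field equilibrium requires; the entropy-regularized RL component of the paper replaces the exact backward induction when the state-action space is too large to enumerate.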