🤖 AI Summary
Multi-agent reinforcement learning (MARL) faces two key robustness bottlenecks in simulation-to-reality transfer: poor adaptability to shifts in environment dynamics, and sample complexity that grows exponentially with the number of agents (the "curse of multiagency").
Method: We propose a behavioral-economics-inspired Distributionally Robust Markov Game (DR-MG) framework, the first to jointly model environmental uncertainty and joint-policy uncertainty via a structured uncertainty set.
Contributions/Results: Theoretically, we establish, for the first time, the existence of robust Nash equilibria and coarse correlated equilibria (CCE) under DR-MG. Algorithmically, we design the first robust MARL algorithm whose sample complexity is polynomial in the number of agents (Poly(N)), breaking the curse of multiagency. By integrating distributionally robust optimization, behavioral game-theoretic modeling, and generative-model-assisted sampling, our approach ensures worst-case performance guarantees while enabling scalable, robust learning. This work delivers the first solution for robust MARL that is both theoretically rigorous and practically scalable.
📝 Abstract
Standard multi-agent reinforcement learning (MARL) algorithms are vulnerable to sim-to-real gaps. To address this, distributionally robust Markov games (RMGs) have been proposed to enhance robustness in MARL by optimizing the worst-case performance when game dynamics shift within a prescribed uncertainty set. RMGs remain under-explored, from problem formulation to the development of sample-efficient algorithms. Two notorious open challenges are the formulation of the uncertainty set and whether the corresponding RMGs can overcome the curse of multiagency, where the sample complexity scales exponentially with the number of agents. In this work, we propose a natural class of RMGs inspired by behavioral economics, where each agent's uncertainty set is shaped by both the environment and the integrated behavior of other agents. We first establish the well-posedness of this class of RMGs by proving the existence of game-theoretic solutions such as robust Nash equilibria and coarse correlated equilibria (CCE). Assuming access to a generative model, we then introduce a sample-efficient algorithm for learning the CCE whose sample complexity scales polynomially with all relevant parameters. To the best of our knowledge, this is the first algorithm to break the curse of multiagency for RMGs, regardless of the uncertainty set formulation.
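For intuition, the robust solution concept can be sketched as follows. The notation here ($\mathcal{U}_i$, $V_i^{\pi}$, $\gamma$, $P^0$) is our own illustrative shorthand, not taken from the paper: each agent evaluates its policy under the worst case over an uncertainty set that depends on both the nominal environment and the joint behavior of the other agents.

```latex
% Hedged sketch of a robust Markov-game objective (illustrative notation).
% Agent i's robust value under joint policy \pi: the infimum over an
% uncertainty set \mathcal{U}_i shaped by the nominal transition kernel P^0
% and the integrated behavior \pi_{-i} of the other agents.
V_i^{\pi}(s) \;=\;
  \inf_{P \,\in\, \mathcal{U}_i\left(P^0,\, \pi_{-i}\right)}
  \mathbb{E}_{\pi,\, P}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\,
    r_i(s_t, a_t) \,\middle|\, s_0 = s \right].

% A joint policy \pi^\star is a robust Nash equilibrium if no agent can
% improve its own worst-case value by deviating unilaterally:
V_i^{\pi^\star}(s) \;\ge\; V_i^{(\pi_i,\, \pi^\star_{-i})}(s)
  \quad \text{for all agents } i,\ \text{all policies } \pi_i,\ \text{all states } s.
```

A coarse correlated equilibrium relaxes this condition to correlated joint policies, which is the solution concept the sample-efficient algorithm targets.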