🤖 AI Summary
This work addresses the coordination challenge in large-scale heterogeneous multi-agent reinforcement learning, where the joint state-action space suffers from exponential growth. To tackle this, the authors propose a scalable approach based on graphon modeling. By subsampling κ agents according to interaction intensity and integrating graphon theory with mean-field approximation, they construct a weighted mean-field policy that preserves heterogeneous interaction structures while substantially reducing computational complexity. This study presents the first integration of graphon theory with mean-field reinforcement learning, providing theoretical guarantees that the sample complexity scales polynomially with κ and the optimality gap is O(1/√κ). Empirical evaluations in robotic coordination simulations demonstrate near-optimal collaborative performance.
📝 Abstract
Coordinating large populations of interacting agents is a central challenge in multi-agent reinforcement learning (MARL), where the size of the joint state-action space scales exponentially with the number of agents. Mean-field methods alleviate this burden by aggregating agent interactions, but these approaches assume homogeneous interactions. Recent graphon-based frameworks capture heterogeneity, but are computationally expensive as the number of agents grows. Therefore, we introduce $\texttt{GMFS}$, a $\textbf{G}$raphon $\textbf{M}$ean-$\textbf{F}$ield $\textbf{S}$ubsampling framework for scalable cooperative MARL with heterogeneous agent interactions. By subsampling $κ$ agents according to interaction strength, we approximate the graphon-weighted mean-field and learn a policy with sample complexity $\mathrm{poly}(κ)$ and optimality gap $O(1/\sqrtκ)$. We verify our theory with numerical simulations in robotic coordination, showing that $\texttt{GMFS}$ achieves near-optimal performance.