🤖 AI Summary
This work addresses the low sample and computational efficiency of multi-agent reinforcement learning (MARL) in cooperative linear quadratic regulator (LQR) tasks. Whereas earlier methods rely on inexact decompositions of local Q-functions built from empirical information structures, the authors propose a systematic, exact decomposition of each agent's local Q-function that explicitly uses the inter-agent coupling information. On top of this decomposition they develop an approximate least-squares policy iteration algorithm and identify two architectures for learning the local Q-function of each agent. The analysis shows that the worst-case sample complexity of the decomposition equals that of the centralized case, and it gives necessary and sufficient graphical conditions on the inter-agent couplings under which sample efficiency improves. Numerical examples demonstrate the gains in both sample and computational efficiency.
📝 Abstract
Developing scalable and efficient reinforcement learning algorithms for cooperative multi-agent control has received significant attention in recent years. Existing literature has proposed inexact decompositions of local Q-functions based on empirical information structures between the agents. In this paper, we exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least-squares policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent. We establish that the worst-case sample complexity of the decomposition is equal to that of the centralized case and derive necessary and sufficient graphical conditions on the inter-agent couplings to achieve better sample efficiency. We demonstrate the improved sample efficiency and computational efficiency on numerical examples.
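To give a rough sense of why a decomposed local Q-function can need fewer samples, and why the worst case coincides with the centralized one, the sketch below counts the unknown parameters of a quadratic Q-function. It is an illustration under stated assumptions, not the paper's algorithm: it assumes an LQR setting in which each local Q-function is a quadratic form in the states and inputs of some subset of agents, and it treats that subset as a given input. The abstract does not say how the subset is derived from the coupling information, so the `dependency_sets`, the agent dimensions, and both function names are hypothetical.

```python
def quadratic_param_count(dim: int) -> int:
    """Number of free entries in a symmetric dim x dim matrix."""
    return dim * (dim + 1) // 2


def local_q_param_count(dependency_set, state_dims, input_dims) -> int:
    """Unknowns in a quadratic Q-function over the listed agents' states and inputs."""
    dim = sum(state_dims[j] + input_dims[j] for j in dependency_set)
    return quadratic_param_count(dim)


# Hypothetical 4-agent example: every agent has 2 states and 1 input.
state_dims = {0: 2, 1: 2, 2: 2, 3: 2}
input_dims = {0: 1, 1: 1, 2: 1, 3: 1}

# Hypothetical dependency sets: which agents' variables enter agent i's
# local Q-function. These are made up for illustration only.
dependency_sets = {
    0: {0, 1},
    1: {0, 1, 2},
    2: {1, 2, 3},
    3: {0, 1, 2, 3},  # worst case: depends on every agent
}

# A centralized Q-function is quadratic in all states and inputs.
centralized = local_q_param_count(set(state_dims), state_dims, input_dims)

# Parameter count of each agent's local Q-function.
local = {
    i: local_q_param_count(dep, state_dims, input_dims)
    for i, dep in dependency_sets.items()
}

print("centralized:", centralized)
print("local:", local)
```

In a least-squares fit, the number of unknowns is a lower bound on the number of independent samples needed to determine them uniquely. An agent whose dependency set is small therefore has fewer parameters to estimate, while an agent whose local Q-function depends on every other agent has exactly as many as the centralized problem.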