Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR

📅 2025-04-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the low sample and computational efficiency of multi-agent reinforcement learning (MARL) in cooperative linear quadratic regulator (LQR) tasks. The authors exploit the inter-agent interaction graph to derive an exact, graph-structured decomposition of each agent's local Q-function, in contrast to prior inexact decompositions based on empirical information structures. Building on this decomposition, they develop an approximate least-squares policy iteration algorithm and identify two architectures for learning each agent's local Q-function. Theoretically, they establish that the worst-case sample complexity of the decomposition equals that of the centralized case, and they derive necessary and sufficient graph-theoretic conditions on the inter-agent couplings under which sample efficiency strictly improves. Numerical experiments confirm the gains in both sample and computational efficiency afforded by explicitly exploiting the coupling structure.

📝 Abstract
Developing scalable and efficient reinforcement learning algorithms for cooperative multi-agent control has received significant attention in recent years. Existing literature has proposed inexact decompositions of local Q-functions based on empirical information structures between the agents. In this paper, we exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least-squares policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent. We establish that the worst-case sample complexity of the decomposition is equal to the centralized case and derive necessary and sufficient graphical conditions on the inter-agent couplings to achieve better sample efficiency. We demonstrate the improved sample and computational efficiency on numerical examples.
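For context, a standard formulation of the cooperative LQR problem the abstract refers to can be sketched as follows (the notation is generic, not taken from the paper):

```latex
% Generic cooperative LQR setup (notation assumed, not from the paper).
% N agents share linear dynamics driven by the joint input u_t:
\[
x_{t+1} = A x_t + B u_t + w_t, \qquad
u_t = \bigl(u_t^1, \dots, u_t^N\bigr),
\]
% and jointly minimize an infinite-horizon quadratic cost under a linear
% feedback policy u_t = -K x_t:
\[
J(K) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^t
\left( x_t^\top Q\, x_t + u_t^\top R\, u_t \right)\right].
\]
% The Q-function of any stabilizing linear policy is an exact quadratic form,
\[
Q_K(x,u) = \begin{bmatrix} x \\ u \end{bmatrix}^{\top} H_K
\begin{bmatrix} x \\ u \end{bmatrix},
\]
% and it is this quadratic object that the paper decomposes agent-by-agent
% along the inter-agent interaction graph into local Q-functions.
```

Least-squares policy iteration then estimates $H_K$ from data and, in the standard LQR case, improves the policy via $K \leftarrow (H_K)_{uu}^{-1}(H_K)_{ux}$.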
Problem

Research questions and friction points this paper is trying to address.

Develop efficient reinforcement learning for multi-agent control
Exploit inter-agent coupling to decompose local Q-functions
Improve sample and computational efficiency in cooperative LQR
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exact decomposition of local Q-functions using coupling
Approximate least square policy iteration algorithm
Graphical conditions for better sample efficiency
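To make the least-squares policy iteration step concrete, below is a minimal, self-contained sketch on a scalar (single-agent) LQR instance: it fits the quadratic Q-function of the current gain from sampled transitions and improves the policy greedily. All constants and function names here are illustrative assumptions; this is not the paper's multi-agent algorithm.

```python
import numpy as np

# Minimal LSPI sketch on a scalar LQR instance: x' = a*x + b*u,
# stage cost q*x^2 + r*u^2. Constants are illustrative assumptions
# (a single-agent stand-in, not the paper's decomposed setting).
a, b, q, r, gamma = 0.9, 1.0, 1.0, 1.0, 0.95

def features(x, u):
    # Quadratic features so that Q(x, u) = theta . phi(x, u) encodes the
    # form [x u] H [x u]^T with theta = (H_xx, H_xu, H_uu).
    return np.array([x * x, 2.0 * x * u, u * u])

def lstdq(K, n_samples=200, seed=0):
    """Least-squares evaluation of the policy u = -K*x from sampled transitions."""
    rng = np.random.default_rng(seed)
    A_mat, b_vec = np.zeros((3, 3)), np.zeros(3)
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)        # exploratory action
        cost = q * x * x + r * u * u
        x_next = a * x + b * u            # deterministic toy dynamics
        u_next = -K * x_next              # next action follows the current policy
        phi = features(x, u)
        # Accumulate the Bellman equations theta . (phi - gamma*phi') = cost
        A_mat += np.outer(phi, phi - gamma * features(x_next, u_next))
        b_vec += phi * cost
    return np.linalg.solve(A_mat, b_vec)

def lspi(K0=0.0, iters=10):
    K = K0                                # K0 = 0 is stabilizing here (|a| < 1)
    for _ in range(iters):
        h_xx, h_xu, h_uu = lstdq(K)
        K = h_xu / h_uu                   # greedy improvement: argmin_u Q(x, u)
    return K

K_lspi = lspi()
```

With exploration rich enough to identify the three quadratic features, the learned gain matches the discounted Riccati solution; the paper's contribution is performing this kind of evaluation per agent on an exactly decomposed local Q-function rather than on the full joint one.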
S. P. Q. Syed
Mechanical and Aerospace Engineering, Oklahoma State University, USA
He Bai
Oklahoma State University
control · robotics · estimation · planning