Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR

📅 2025-04-29

🤖 AI Summary

This work addresses the low sample and computational efficiency of multi-agent reinforcement learning (MARL) in cooperative linear quadratic regulator (LQR) tasks. It proposes a systematic, exact decomposition of each agent's local Q-function that exploits inter-agent coupling information, in contrast to the inexact decompositions based on empirical information structures used in prior work. Building on this decomposition, the authors develop an approximate least-squares policy iteration algorithm and identify two architectures for learning each agent's local Q-function. On the theory side, they show that the worst-case sample complexity of the decomposition equals that of the centralized case, and they derive necessary and sufficient graphical conditions on the inter-agent couplings under which sample efficiency improves. Numerical examples demonstrate the improved sample and computational efficiency.

📝 Abstract

Developing scalable and efficient reinforcement learning algorithms for cooperative multi-agent control has received significant attention in recent years. Existing literature has proposed inexact decompositions of local Q-functions based on empirical information structures between the agents. In this paper, we exploit inter-agent coupling information and propose a systematic approach to exactly decompose the local Q-function of each agent. We develop an approximate least-squares policy iteration algorithm based on the proposed decomposition and identify two architectures to learn the local Q-function for each agent. We establish that the worst-case sample complexity of the decomposition is equal to that of the centralized case and derive necessary and sufficient graphical conditions on the inter-agent couplings to achieve better sample efficiency. We demonstrate the improved sample efficiency and computational efficiency on numerical examples.
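For orientation, the centralized baseline that the abstract compares against can be sketched as textbook least-squares policy iteration for a single LQR problem. This is not the paper's decomposed algorithm: the function name `lspi_lqr`, the toy matrices, and the noise-free sampling scheme are all assumptions made for illustration. The sketch fits a quadratic Q-function `Q(x, u) = z^T H z` with `z = [x; u]` from sampled transitions, then improves the gain with `K = -H_uu^{-1} H_ux`.

```python
import numpy as np

def lspi_lqr(A, B, Q, R, K0, n_samples=200, n_iters=10, seed=0):
    """Centralized least-squares policy iteration for LQR (illustrative sketch).

    A and B are used only to simulate transitions; the learner itself
    sees just (x, u, cost, x') tuples. K0 must be a stabilizing gain.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    d = n + m

    # Exploratory, noise-free transitions x' = A x + B u.
    X = rng.standard_normal((n_samples, n))
    U = rng.standard_normal((n_samples, m))
    Xn = X @ A.T + U @ B.T
    cost = np.einsum("ij,jk,ik->i", X, Q, X) + np.einsum("ij,jk,ik->i", U, R, U)

    # Features for a symmetric quadratic form: upper-triangular monomials,
    # with off-diagonal terms doubled so that features @ theta == z^T H z.
    iu = np.triu_indices(d)
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)

    def quad_features(Z):
        return Z[:, iu[0]] * Z[:, iu[1]] * scale

    K = K0.copy()
    Z = np.hstack([X, U])
    for _ in range(n_iters):
        # Policy evaluation: z^T H z - z'^T H z' = cost, with z' = [x'; K x'].
        Zn = np.hstack([Xn, Xn @ K.T])
        Phi = quad_features(Z) - quad_features(Zn)
        theta, *_ = np.linalg.lstsq(Phi, cost, rcond=None)
        H = np.zeros((d, d))
        H[iu] = theta
        H = H + H.T - np.diag(np.diag(H))
        # Policy improvement: minimize z^T H z over u for fixed x.
        K = -np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Toy example (hypothetical system, open-loop stable so K0 = 0 is stabilizing).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
K, H = lspi_lqr(A, B, Q, R, K0=np.zeros((1, 2)))
```

In this centralized form the least-squares step estimates every entry of the symmetric matrix `H`, which is the cost that a decomposition along the coupling structure aims to reduce.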

Problem

Research questions and friction points this paper is trying to address.

Develop efficient reinforcement learning for multi-agent control

Exploit inter-agent coupling to decompose local Q-functions

Improve sample and computational efficiency in cooperative LQR

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exact decomposition of local Q-functions using inter-agent coupling information

Approximate least-squares policy iteration algorithm

Graphical conditions for better sample efficiency
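To give a rough sense of why a smaller local Q-function can need fewer samples, the sketch below counts the free parameters of a quadratic Q-function. A symmetric quadratic form in `d` variables has `d(d+1)/2` unknowns, and a least-squares fit needs at least that many independent equations. The agent dimensions and the dependency set `[0, 1]` are hypothetical; which agents a local Q-function actually depends on is determined by the paper's graphical conditions, which this summary does not reproduce.

```python
def quadratic_param_count(dim):
    """Number of free entries in a symmetric dim x dim matrix."""
    return dim * (dim + 1) // 2

def local_q_param_count(state_dims, input_dims, agents):
    """Unknowns in a quadratic Q-function over the states and inputs
    of the given agents (the agent set is an illustrative assumption)."""
    d = sum(state_dims[i] + input_dims[i] for i in agents)
    return quadratic_param_count(d)

# Four hypothetical agents, each with 2 states and 1 input.
state_dims = [2, 2, 2, 2]
input_dims = [1, 1, 1, 1]

# Depends on all agents: same size as the centralized Q-function.
centralized = local_q_param_count(state_dims, input_dims, range(4))  # 78

# Depends on agents 0 and 1 only: far fewer unknowns.
local = local_q_param_count(state_dims, input_dims, [0, 1])  # 21
```

When the dependency set covers every agent, the count matches the centralized one, which is consistent with the abstract's statement that the worst-case sample complexity equals that of the centralized case.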
