Learning Closed-Loop Parametric Nash Equilibria of Multi-Agent Collaborative Field Coverage

📅 2025-03-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the multi-agent cooperative area coverage problem and establishes, for the first time, its theoretical equivalence to a Markov potential game (MPG). This equivalence enables reformulating distributed Nash equilibrium computation as a single-objective closed-loop optimal control problem. Leveraging this insight, we propose a parameterized closed-loop Nash equilibrium learning framework: policy optimization is guided by the MPG’s potential function, ensuring both global optimality and fully decentralized execution. The method integrates MPG analysis, parameterized closed-loop policy representation, and multi-agent reinforcement learning. Experiments demonstrate a tenfold acceleration in training speed, faster policy convergence, and substantial improvements over conventional game-theoretic baselines. The core contribution lies in establishing a rigorous theoretical connection between cooperative coverage and MPG, and—based on this—designing the first closed-loop equilibrium learning paradigm that simultaneously provides theoretical guarantees and high scalability.

Technology Category

Application Category

📝 Abstract
Multi-agent reinforcement learning is a challenging and active field of research due to the inherent nonstationary property and coupling between agents. A popular approach to modeling the multi-agent interactions underlying the multi-agent RL problem is the Markov Game. There is a special type of Markov Game, termed Markov Potential Game, which allows us to reduce the Markov Game to a single-objective optimal control problem where the objective function is a potential function. In this work, we prove that a multi-agent collaborative field coverage problem, which is found in many engineering applications, can be formulated as a Markov Potential Game, and we can learn a parameterized closed-loop Nash Equilibrium by solving an equivalent single-objective optimal control problem. As a result, our algorithm is 10x faster during training compared to a game-theoretic baseline and converges faster during policy execution.
Problem

Research questions and friction points this paper is trying to address.

Formulates multi-agent field coverage as Markov Potential Game
Learns closed-loop Nash Equilibrium via single-objective control
Improves training speed and policy convergence significantly
Innovation

Methods, ideas, or system contributions that make the work stand out.

Formulates coverage as Markov Potential Game
Solves single-objective optimal control problem
Achieves 10x faster training convergence
🔎 Similar Papers
No similar papers found.
J
Jushan Chen
Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, 110 8th St, Troy, NY 12180, USA
Santiago Paternain
Santiago Paternain
Rensselaer Polytechnic Institute
Reinforcement LearningOptimizationControl Theory