Physics-Informed Multi-Agent Reinforcement Learning for Distributed Multi-Robot Problems

📅 2023-12-30
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
To address the scalability limitations of centralized policies and the lack of coordination in independent policies for multi-robot systems, this paper proposes a scalable and physically consistent distributed reinforcement learning framework. Methodologically, it pioneers the integration of port-Hamiltonian system modeling with graph self-attention mechanisms, enabling efficient local information utilization under dynamic interaction graphs while enforcing energy conservation; it employs distributed policy parameterization—bypassing value function decomposition—and explicitly encodes inter-robot dependencies. Key contributions include: (1) a self-attention–enhanced port-Hamiltonian policy representation; (2) zero-shot post-training scalability to robot swarms six times larger than those used during training; and (3) cumulative rewards twice those of prior state-of-the-art methods across multiple tasks, demonstrating significant improvements in scalability and collaborative efficiency.

📝 Abstract
The networked nature of multi-robot systems presents challenges in the context of multi-agent reinforcement learning. Centralized control policies do not scale with increasing numbers of robots, whereas independent control policies do not exploit the information provided by other robots, exhibiting poor performance in cooperative-competitive tasks. In this work we propose a physics-informed reinforcement learning approach that learns distributed multi-robot control policies that are both scalable and make use of all the information available to each robot. Our approach has three key characteristics. First, it imposes a port-Hamiltonian structure on the policy representation, respecting the energy conservation properties of physical robot systems and the networked nature of robot team interactions. Second, it uses self-attention to ensure a sparse policy representation able to handle time-varying information at each robot from the interaction graph. Third, we present a soft actor-critic reinforcement learning algorithm parameterized by our self-attention port-Hamiltonian control policy, which accounts for the correlation among robots during training while overcoming the need for value function factorization. Extensive simulations in different multi-robot scenarios demonstrate the success of the proposed approach, surpassing previous multi-robot reinforcement learning solutions in scalability while achieving similar or superior performance (average cumulative reward up to 2× greater than the state of the art, with robot teams 6× larger than at training time).
Problem

Research questions and friction points this paper is trying to address.

Scalable distributed control for multi-robot systems
Cooperative-competitive task performance improvement
Physics-informed policy respecting energy conservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Physics-informed reinforcement learning for distributed control
Port-Hamiltonian policy structure ensures energy conservation
Self-attention enables sparse, scalable policy representation
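The ideas in these bullets can be sketched in a few lines. This is a minimal illustrative example only, not the paper's implementation: the function `ph_policy`, the quadratic Hamiltonian H = ½(|q|² + |p|²), and all weight shapes are hypothetical assumptions made here to show how a per-robot control could combine an energy-based (port-Hamiltonian-style) term with a self-attention aggregation over a time-varying neighbor set.

```python
import numpy as np

def self_attention(query, keys, values):
    # Scaled dot-product attention over a robot's current neighbors.
    # The number of neighbors (rows of keys/values) may change each step,
    # which is what makes the representation scale to larger teams.
    d = query.shape[-1]
    scores = query @ keys.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values

def ph_policy(q, p, neighbor_states, W_q, W_k, W_v, K, D):
    """Illustrative port-Hamiltonian-style control for one robot.

    Assumes a toy quadratic Hamiltonian H = 0.5*(|q|^2 + |p|^2), so
    dH/dq = q and dH/dp = p. The neighbor message shapes the energy
    term; D injects damping. All of this is a sketch, not the paper's
    learned parameterization.
    """
    x = np.concatenate([q, p])                      # own state (q, p)
    msg = self_attention((x @ W_q)[None, :],        # query from own state
                         neighbor_states @ W_k,     # keys from neighbors
                         neighbor_states @ W_v)[0]  # values -> message
    dH_dq, dH_dp = q, p
    # energy-shaping + damping-injection control, modulated by the message
    return -K @ (dH_dq + msg) - D @ dH_dp
```

Because the attention aggregates over however many neighbors are present, the same weights apply unchanged to larger teams, which mirrors the zero-shot scalability claim (training on small teams, deploying on teams several times larger).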
Authors
Eduardo Sebastián, University of Cambridge (Robotics, Networked Systems, Control, Learning)
T. Duong, Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093 USA
Nikolay Atanasov, Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093 USA
E. Montijano, DIIS - I3A, Universidad de Zaragoza, Spain
C. Sagüés, DIIS - I3A, Universidad de Zaragoza, Spain