🤖 AI Summary
Problem: Distributed Model Predictive Control (DMPC) for large-scale multi-robot systems suffers from poor real-time performance, high computational complexity, and difficulty guaranteeing both cooperative stability and collision-free safety. Method: This paper proposes a Distributed Learning-based Predictive Control (DLPC) framework that replaces conventional numerical optimization solvers with an incremental, online, distributed Actor-Critic policy learning mechanism. The mechanism generates explicit closed-loop policies that can be deployed at millisecond-level latency. The framework also incorporates force-field-inspired modeling of safety constraints to make collision avoidance more robust. Contribution/Results: Evaluated in simulations with up to 10,000 robots, DLPC achieves policy deployment latency under 10 ms, demonstrates strong scalability and cross-scale transferability, and provides the first end-to-end learning-based DMPC solution for ultra-large-scale swarms that eliminates reliance on numerical solvers.
📝 Abstract
Distributed model predictive control (DMPC) is promising in achieving optimal cooperative control in multirobot systems (MRS). However, real-time DMPC implementation relies on numerical optimization tools to periodically calculate local control sequences online. This process is computationally demanding and lacks scalability for large-scale, nonlinear MRS. This article proposes a novel distributed learning-based predictive control (DLPC) framework for scalable multirobot control. Unlike conventional DMPC methods that calculate open-loop control sequences, our approach centers around a computationally fast and efficient distributed policy learning algorithm that generates explicit closed-loop DMPC policies for MRS without using numerical solvers. The policy learning is executed incrementally and forward in time in each prediction interval through an online distributed actor-critic implementation. The control policies are successively updated in a receding-horizon manner, enabling fast and efficient policy learning with the closed-loop stability guarantee. The learned control policies could be deployed online to MRS with varying robot scales, enhancing scalability and transferability for large-scale MRS. Furthermore, we extend our methodology to address the multirobot safe learning challenge through a force field-inspired policy learning approach. We validate our approach's effectiveness, scalability, and efficiency through extensive experiments on cooperative tasks of large-scale wheeled robots and multirotor drones. Our results demonstrate the rapid learning and deployment of DMPC policies for MRS with scales up to 10,000 units.
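To make the "force field-inspired" idea concrete, the sketch below uses the classic artificial-potential-field repulsion, which is an assumption here: the abstract does not give the paper's actual formulation, nor how the term enters policy learning. The function names (`repulsive_force`, `safe_control`), the parameters `d_safe`, `gain`, and `k_goal`, and the proportional goal-seeking term standing in for a learned actor are all hypothetical.

```python
import math


def repulsive_force(p_i, p_j, d_safe=1.0, gain=1.0):
    """Repulsion acting on robot i due to robot j (2-D positions).

    Assumed potential: U(d) = 0.5 * gain * (1/d - 1/d_safe)**2 for d < d_safe,
    and zero otherwise. The returned force is the negative gradient of U
    with respect to p_i, so it points from j toward i.
    """
    dx, dy = p_i[0] - p_j[0], p_i[1] - p_j[1]
    d = math.hypot(dx, dy)
    if d >= d_safe or d == 0.0:
        # Out of range, or coincident positions where the direction is undefined.
        return (0.0, 0.0)
    mag = gain * (1.0 / d - 1.0 / d_safe) / d**2
    return (mag * dx / d, mag * dy / d)


def safe_control(i, positions, goals, k_goal=1.0, d_safe=1.0, gain=1.0):
    """Explicit closed-loop control for robot i: goal attraction plus repulsion.

    The proportional term is a hand-written placeholder for a learned policy.
    For brevity this loops over all robots; a distributed deployment would
    only consider robot i's communication neighbors.
    """
    px, py = positions[i]
    gx, gy = goals[i]
    ux, uy = k_goal * (gx - px), k_goal * (gy - py)
    for j, p_j in enumerate(positions):
        if j != i:
            fx, fy = repulsive_force(positions[i], p_j, d_safe, gain)
            ux, uy = ux + fx, uy + fy
    return (ux, uy)


if __name__ == "__main__":
    positions = [(0.0, 0.0), (0.5, 0.0)]
    goals = [(1.0, 0.0), (0.5, 0.0)]
    # Robot 0's goal lies beyond robot 1, so the repulsion opposes the attraction.
    print(safe_control(0, positions, goals))
```

Because the control is an explicit function of local state, evaluating it costs a handful of arithmetic operations per neighbor and needs no optimization solver at run time. The same function also applies unchanged to any number of robots, which is the property the abstract relies on for scalability and transferability.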