🤖 AI Summary
This work addresses the combinatorial explosion and high computational overhead inherent in task allocation and path planning for heterogeneous robotic teams in extraterrestrial exploration. To tackle these challenges, the authors propose a cooperative planning framework based on Multi-Agent Proximal Policy Optimization (MAPPO), which integrates reinforcement learning into the joint optimization of tasks and trajectories. By shifting the computational burden to offline training, the approach enables efficient online replanning in dynamic environments. Experimental results in planetary exploration simulations indicate that the method closely approximates single-objective optimal solutions, substantially reduces planning latency, and scales well enough for real-time use, making it well suited to large heterogeneous robotic teams.
📝 Abstract
Efficient robotic extraterrestrial exploration requires robots with diverse capabilities, ranging from scientific measurement tools to advanced locomotion. A robotic team distributes tasks over multiple specialized subsystems, each providing specific expertise to complete the mission. The central challenge lies in coordinating the team efficiently to maximize utilization and the extraction of scientific value. Classical planning algorithms scale poorly with problem size: the combinatorial growth of possible robot-target allocations and trajectories leads to long planning cycles and high inference costs. Learning-based methods are a viable alternative that moves the scaling concern from runtime to training time, a critical step towards real-time planning. In this work, we present a collaborative planning strategy based on Multi-Agent Proximal Policy Optimization (MAPPO) that coordinates a team of heterogeneous robots to solve a complex target allocation and scheduling problem. We benchmark our approach against single-objective optimal solutions obtained through exhaustive search and evaluate its ability to perform online replanning in a planetary exploration scenario.
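To make the combinatorial growth of robot-target allocations concrete, the following is a minimal sketch of an exhaustive search over allocations for a toy problem. It is not the authors' baseline: the data layout, the capability check, the travel-distance objective, and the names `route_length` and `exhaustive_allocation` are all assumptions made for illustration. With R robots and T targets the loop visits R^T candidate allocations, and each candidate additionally tries every visiting order, which is why this style of search becomes impractical as the team or target set grows.

```python
from itertools import permutations, product
from math import dist, inf


def route_length(start, points):
    """Shortest open route from `start` through all `points` (brute force over orderings)."""
    if not points:
        return 0.0
    best = inf
    for order in permutations(points):
        length = dist(start, order[0])
        length += sum(dist(a, b) for a, b in zip(order, order[1:]))
        best = min(best, length)
    return best


def exhaustive_allocation(robots, targets):
    """Try every robot-target allocation and keep the cheapest feasible one.

    robots:  {name: (position, set_of_capabilities)}   (hypothetical layout)
    targets: [(position, required_capability), ...]    (hypothetical layout)
    The single objective here is total travel distance, an assumption for this sketch.
    """
    names = list(robots)
    best_cost, best_assignment = inf, None
    # len(names) ** len(targets) candidate allocations in total.
    for assignment in product(names, repeat=len(targets)):
        # Skip allocations where a robot lacks the capability a target requires.
        if any(targets[i][1] not in robots[r][1] for i, r in enumerate(assignment)):
            continue
        cost = sum(
            route_length(
                robots[r][0],
                [targets[i][0] for i, a in enumerate(assignment) if a == r],
            )
            for r in names
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_assignment, best_cost


# Toy scenario: a rover that can drill and image, and a drone that can only image.
robots = {
    "rover": ((0.0, 0.0), {"camera", "drill"}),
    "drone": ((10.0, 0.0), {"camera"}),
}
targets = [
    ((1.0, 0.0), "drill"),
    ((9.0, 0.0), "camera"),
    ((2.0, 0.0), "camera"),
]

assignment, cost = exhaustive_allocation(robots, targets)
print(assignment, cost)  # ('rover', 'drone', 'rover') 3.0
```

Even this three-target example evaluates 2^3 = 8 allocations before filtering. A learned policy of the kind the abstract describes would instead pay its cost during training and produce an allocation at runtime without enumerating the candidates.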