🤖 AI Summary
Distributed multi-agent path planning in obstacle-dense environments suffers from myopic decision-making and inefficient coordination because each agent acts only on local observations. This work integrates sheaf theory into multi-agent reinforcement learning, building a framework that models geometric dependencies between agents through local consensus and grounds global coordination in sheaf-theoretic conditions for consensus. The method combines self-supervised consensus learning in latent space, geometry-aware graph neural networks, and distributed deep reinforcement learning, so that agents reach implicit consensus on latent motion patterns. In large-scale simulations and real-robot experiments, the approach is reported to reduce path length by 23%, improve collaborative success rate by 37%, and decrease communication overhead by 41% compared with state-of-the-art methods.
📝 Abstract
The Multi-Agent Path Finding (MAPF) problem aims to determine shortest, collision-free paths for multiple agents in a known, potentially obstacle-ridden environment. It is a core challenge for robotic deployments in large-scale logistics and transportation. Decentralized learning-based approaches have shown great potential for MAPF, offering more reactive and scalable solutions. However, existing learning-based MAPF methods usually have agents make decisions from a limited field of view (FOV), which results in short-sighted policies and inefficient cooperation in complex scenarios. In such settings, a critical challenge is for agents to reach consensus on potential movements from limited observations and communication. To tackle this challenge, we introduce a new framework that applies sheaf theory to decentralized deep reinforcement learning, enabling agents to learn geometric cross-dependencies between one another through local consensus and to use them for tightly cooperative decision-making. In particular, sheaf theory provides mathematically proven conditions under which global consensus can be achieved from local observations. Inspired by this, we use a neural network to approximately model the sheaf-theoretic consensus in latent space and train it through self-supervised learning. During the task, in addition to the standard MAPF features used in previous works, each agent reasons in a distributed manner about a learned consensus feature, leading to efficient cooperation in pathfinding and collision avoidance. As a result, our method shows significant improvements over state-of-the-art learning-based MAPF planners in various simulations and real-world robot experiments, especially in relatively large and complex scenarios.
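To make the notion of "global consensus through local observation" more concrete, the sketch below builds a small cellular sheaf over a communication graph and measures how far a set of agent features is from consensus. It is an illustration of the standard sheaf Laplacian construction, not the authors' implementation: the function names (`sheaf_laplacian`, `disagreement`, `diffuse`), the three-agent graph, and the hand-picked restriction maps are all assumptions made for this example, whereas the paper learns its consensus model with a neural network in latent space.

```python
import numpy as np

def sheaf_laplacian(num_agents, dim, edges, maps):
    """Return L = delta^T delta for a sheaf with `dim`-dimensional stalks.

    maps[(k, a)] is the (hypothetical) restriction map that sends agent a's
    feature into the shared space of edge k.
    """
    # delta has one block row per edge and one block column per agent.
    delta = np.zeros((len(edges) * dim, num_agents * dim))
    for k, (u, v) in enumerate(edges):
        rows = slice(k * dim, (k + 1) * dim)
        delta[rows, u * dim:(u + 1) * dim] = maps[(k, u)]
        delta[rows, v * dim:(v + 1) * dim] = -maps[(k, v)]
    return delta.T @ delta

def disagreement(laplacian, features):
    """Sum over edges of squared mismatch; zero exactly at consensus."""
    x = features.reshape(-1)
    return float(x @ laplacian @ x)

def diffuse(laplacian, features, iters=200):
    """Repeated local averaging x <- x - step * L x, driving x toward consensus."""
    step = 1.0 / np.linalg.eigvalsh(laplacian).max()
    x = features.reshape(-1).astype(float)
    for _ in range(iters):
        x = x - step * (laplacian @ x)
    return x.reshape(features.shape)

# Assumed toy setup: three agents on a path 0 - 1 - 2 with 2-D features.
edges = [(0, 1), (1, 2)]
I = np.eye(2)
R = np.array([[0.0, -1.0], [1.0, 0.0]])  # agent 1's frame is rotated on edge 0
maps = {(0, 0): I, (0, 1): R, (1, 1): I, (1, 2): I}
L = sheaf_laplacian(3, 2, edges, maps)

consistent = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]])    # x0 = R @ x1, x1 = x2
inconsistent = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])  # ignores the rotation

print(disagreement(L, consistent))    # 0.0
print(disagreement(L, inconsistent))  # 2.0
print(disagreement(L, diffuse(L, inconsistent)))  # close to zero after diffusion
```

Each edge term only involves the two agents it connects, so every agent can evaluate and reduce its own share of the disagreement from neighbor messages alone. Features with zero total disagreement form a global section of the sheaf, which is the sense in which locally checked agreement implies global consensus in this toy setting.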