Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding

📅 2026-05-08

🤖 AI Summary

Insufficient coordination between agents limits both solution quality and scalability in large-scale multi-agent pathfinding (MAPF). This work proposes LC-MAPF, a framework that formulates MAPF as a decentralized partially observable Markov decision process (Dec-POMDP) with a learnable local communication mechanism. LC-MAPF combines imitation learning and reinforcement learning, and uses graph neural networks to aggregate neighborhood information over multiple rounds of local message passing before each agent selects its action. The reported experiments show that LC-MAPF outperforms existing learning-based solvers across diverse unseen scenarios, with higher success rates and better path quality, while remaining scalable.

📝 Abstract

Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning problems, where multiple homogeneous agents move simultaneously within a shared environment. Although solving MAPF optimally is NP-hard, scalable and efficient solvers are critical for real-world applications such as logistics and search-and-rescue. To this end, the research community has proposed various decentralized suboptimal MAPF solvers that leverage machine learning. Such methods frame MAPF, from a single agent's perspective, as a Dec-POMDP in which the agent must choose an action at each time step based on its local observation, and they typically solve the problem via reinforcement learning (RL) or imitation learning (IL). We follow the same approach but additionally introduce a learnable communication module tailored to enhance cooperation between agents via efficient feature sharing. We present Local Communication for Multi-agent Pathfinding (LC-MAPF), a generalizable pre-trained model that applies multi-round communication between neighboring agents to exchange information and improve their coordination. Our experiments show that the introduced method outperforms existing learning-based MAPF solvers, including IL-based and RL-based approaches, across multiple metrics in a diverse range of unseen test scenarios. Remarkably, the introduced communication mechanism does not compromise LC-MAPF's scalability, a common bottleneck for communication-based MAPF solvers.
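The idea of multi-round communication between neighboring agents can be illustrated with a minimal sketch. This is not the LC-MAPF architecture: the learned message and update functions are replaced here by a plain mean over each agent's neighborhood, and the function names (`neighbors_within`, `communicate`), the Chebyshev-distance communication radius, and the one-dimensional features are all assumptions made for illustration.

```python
from typing import List, Tuple

Position = Tuple[int, int]
Feature = List[float]


def neighbors_within(positions: List[Position], radius: int) -> List[List[int]]:
    """For each agent, list the other agents within `radius` cells.

    Chebyshev distance is an assumption; the paper may define
    neighborhoods differently.
    """
    result = []
    for i, (xi, yi) in enumerate(positions):
        near = [
            j
            for j, (xj, yj) in enumerate(positions)
            if j != i and max(abs(xi - xj), abs(yi - yj)) <= radius
        ]
        result.append(near)
    return result


def communicate(
    features: List[Feature], neighbors: List[List[int]], rounds: int
) -> List[Feature]:
    """Run `rounds` of local message passing.

    Each round, every agent replaces its feature vector with the mean of
    its own vector and its neighbors' vectors. The mean is a hypothetical
    stand-in for a learned aggregation layer.
    """
    current = [list(f) for f in features]
    for _ in range(rounds):
        updated = []
        for i, own in enumerate(current):
            group = [own] + [current[j] for j in neighbors[i]]
            updated.append([sum(vals) / len(group) for vals in zip(*group)])
        current = updated  # all agents update synchronously
    return current


# Three agents in a row plus one isolated agent far away.
positions = [(0, 0), (1, 0), (2, 0), (9, 9)]
features = [[1.0], [0.0], [0.0], [5.0]]

neighbors = neighbors_within(positions, radius=1)
one_round = communicate(features, neighbors, rounds=1)
two_rounds = communicate(features, neighbors, rounds=2)
```

In this toy setup, agent 2 is not a direct neighbor of agent 0, so agent 0's information reaches it only in the second round, relayed through agent 1. The isolated agent never exchanges anything. Each agent's cost per round depends only on the size of its own neighborhood, which is one plausible reason local communication can scale to many agents.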

Problem

Research questions and friction points this paper is trying to address.

multi-agent pathfinding

decentralized coordination

learnable communication

scalability

local observation

Innovation

Methods, ideas, or system contributions that make the work stand out.

learnable communication

multi-agent pathfinding

local observation