🤖 AI Summary
Learning-based multi-agent path finding (MAPF) has so far been dominated by decentralized modeling, which limits generalization. This paper introduces RAILGUN, the first centralized learning framework for MAPF: it takes the entire map as input and employs a CNN-based architecture to produce global action assignments end-to-end, eliminating any explicit dependence on agent count or map size. The method adopts a supervised learning paradigm, training a unified policy network on trajectory datasets generated by rule-based planners. RAILGUN achieves zero-shot generalization to unseen maps, tasks, and agent counts, the first such result in MAPF, and significantly outperforms state-of-the-art decentralized baselines across multiple benchmarks, empirically supporting both the feasibility and the advantages of centralized learning for MAPF.
📝 Abstract
Multi-Agent Path Finding (MAPF), which focuses on finding collision-free paths for multiple robots, is crucial for applications ranging from aerial swarms to warehouse automation. Since solving MAPF optimally is NP-hard, learning-based approaches have gained attention, particularly those leveraging deep neural networks. Nonetheless, despite the community's continued efforts, all learning-based MAPF planners still rely on decentralized planning because of variability in the number of agents and in map sizes. We have developed the first centralized learning-based policy for the MAPF problem, called RAILGUN. RAILGUN is not an agent-based policy but a map-based policy: by leveraging a CNN-based architecture, it can generalize across different maps and handle any number of agents. We collect trajectories from rule-based methods to train our model in a supervised way. In experiments, RAILGUN outperforms most baseline methods and demonstrates strong zero-shot generalization on tasks, maps, and agent numbers that were not seen in the training dataset.
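The key idea behind a "map-based" policy can be sketched as follows: the network emits one action distribution per grid cell, and each agent simply reads off the action at its current cell, so the output shape never depends on how many agents there are. This is a hedged illustration, not the paper's actual implementation; the action set, array shapes, and the `decode_actions` helper are assumptions made for the example.

```python
import numpy as np

# Hypothetical action set for a 4-connected grid (an assumption, not
# necessarily the paper's exact encoding).
ACTIONS = ["stay", "up", "down", "left", "right"]

def decode_actions(action_logits, agent_positions):
    """Assign each agent the argmax action of its own cell.

    action_logits: (H, W, A) array, e.g. produced by a CNN over the map.
    agent_positions: list of (row, col) tuples; any number of agents works,
    because the policy output is indexed by map cell, not by agent.
    """
    per_cell = action_logits.argmax(axis=-1)  # (H, W): best action per cell
    return [ACTIONS[per_cell[r, c]] for r, c in agent_positions]

# Toy usage: a 3x3 map where every cell prefers "right" except (1, 1).
logits = np.zeros((3, 3, len(ACTIONS)))
logits[..., 4] = 1.0   # bias all cells toward "right"
logits[1, 1, 0] = 2.0  # cell (1, 1) prefers "stay"
print(decode_actions(logits, [(0, 0), (1, 1)]))  # -> ['right', 'stay']
```

The same logits grid serves one agent or a hundred, which is why such a policy is agent-count agnostic by construction.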