🤖 AI Summary
Learning-based multi-agent path finding (MAPF) has so far been dominated by decentralized modeling, which limits generalization. This paper introduces RAILGUN, the first centralized learning framework for MAPF: it takes the entire map as input and employs a CNN-based architecture to produce global action assignments end-to-end, eliminating any explicit dependence on agent count or map size. The method adopts a supervised learning paradigm, training a unified policy network on trajectory datasets generated by rule-based planners. RAILGUN achieves zero-shot generalization to unseen maps, tasks, and agent counts, the first such result in MAPF, and significantly outperforms state-of-the-art decentralized baselines across multiple benchmarks, empirically supporting both the feasibility and the advantages of centralized learning for MAPF.
📝 Abstract
Multi-Agent Path Finding (MAPF), which focuses on finding collision-free paths for multiple robots, is crucial for applications ranging from aerial swarms to warehouse automation. Since solving MAPF optimally is NP-hard, learning-based approaches have gained attention, particularly those leveraging deep neural networks. Nonetheless, despite the community's continued efforts, all learning-based MAPF planners still rely on decentralized planning because of variability in the number of agents and in map sizes. We have developed the first centralized learning-based policy for the MAPF problem, called RAILGUN. RAILGUN is not an agent-based policy but a map-based policy: by leveraging a CNN-based architecture, it can generalize across different maps and handle any number of agents. We collect trajectories from rule-based methods to train our model in a supervised way. In experiments, RAILGUN outperforms most baseline methods and demonstrates strong zero-shot generalization on tasks, maps, and agent numbers that were not seen in the training dataset.
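The key idea behind a "map-based" policy can be sketched as follows: the network emits one action distribution per grid cell, and each agent simply reads off the action at its current cell, so the output shape never depends on how many agents there are. This is a hedged illustration, not the paper's actual implementation; the action set, array shapes, and the `decode_actions` helper are assumptions made for the example.

```python
import numpy as np

# Hypothetical action set for a 4-connected grid (an assumption, not
# necessarily the paper's exact encoding).
ACTIONS = ["stay", "up", "down", "left", "right"]

def decode_actions(action_logits, agent_positions):
    """Assign each agent the argmax action of its own cell.

    action_logits: (H, W, A) array, e.g. produced by a CNN over the map.
    agent_positions: list of (row, col) tuples; any number of agents works,
    because the policy output is indexed by map cell, not by agent.
    """
    per_cell = action_logits.argmax(axis=-1)  # (H, W): best action per cell
    return [ACTIONS[per_cell[r, c]] for r, c in agent_positions]

# Toy usage: a 3x3 map where every cell prefers "right" except (1, 1).
logits = np.zeros((3, 3, len(ACTIONS)))
logits[..., 4] = 1.0   # bias all cells toward "right"
logits[1, 1, 0] = 2.0  # cell (1, 1) prefers "stay"
print(decode_actions(logits, [(0, 0), (1, 1)]))  # -> ['right', 'stay']
```

The same logits grid serves one agent or a hundred, which is why such a policy is agent-count agnostic by construction.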