Sequence Pathfinder for Multi-Agent Pickup and Delivery in the Warehouse

📅 2025-09-28
🤖 AI Summary
To address limited local observations, high explicit-communication overhead, and poor global coordination in Multi-Agent Pickup and Delivery (MAPD) tasks in narrow-aisle environments such as warehouses, this paper proposes a sequence-based path planning framework with implicit coordination. It jointly encodes multi-agent trajectories into a sequence and uses a Transformer architecture to exchange information implicitly, without explicit inter-agent communication. The paper proves that the resulting sequential policy has order-invariant optimality, and the approach reduces decision complexity from exponential to linear in the number of agents. Combining distributed execution with imitation learning is intended to provide real-time responsiveness and strong generalization. Experiments show that the method consistently outperforms existing learning-based approaches across multiple MAPF tasks and their variants and generalizes well to unseen environments, improving both coordination efficiency and global awareness.

📝 Abstract
Multi-Agent Pickup and Delivery (MAPD) is a challenging extension of Multi-Agent Path Finding (MAPF), where agents are required to sequentially complete tasks with fixed-location pickup and delivery demands. Although learning-based methods have made progress in MAPD, they often perform poorly in warehouse-like environments with narrow pathways and long corridors when relying only on local observations for distributed decision-making. Communication learning can alleviate the lack of global information but introduces high computational complexity due to point-to-point communication. To address this challenge, we formulate MAPF as a sequence modeling problem and prove that path-finding policies under sequence modeling possess order-invariant optimality, ensuring their effectiveness in MAPD. Building on this, we propose the Sequential Pathfinder (SePar), which leverages the Transformer paradigm to achieve implicit information exchange, reducing decision-making complexity from exponential to linear while maintaining efficiency and global awareness. Experiments demonstrate that SePar consistently outperforms existing learning-based methods across various MAPF tasks and their variants, and generalizes well to unseen environments. Furthermore, we highlight the necessity of integrating imitation learning in complex maps like warehouses.
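The exponential-to-linear claim can be illustrated by counting how many candidates a decision step has to score. A joint decision considers every combination of per-agent actions, while a sequential decision scores each agent's actions in turn. The sketch below is only an illustration under stated assumptions: the default of five actions per agent (four grid moves plus wait) is not given in the abstract, and both function names are hypothetical.

```python
def joint_candidates(num_agents: int, num_actions: int = 5) -> int:
    """Joint decision: one candidate per combination of per-agent actions."""
    # Assumption: every agent has the same number of actions (5 by default).
    return num_actions ** num_agents


def sequential_candidates(num_agents: int, num_actions: int = 5) -> int:
    """Sequential decision: each agent's actions are scored once, in turn."""
    return num_agents * num_actions


if __name__ == "__main__":
    # Compare the two counts as the number of agents grows.
    for n in (2, 4, 8, 16):
        print(n, joint_candidates(n), sequential_candidates(n))
```

With eight agents and five actions each, the joint count is 390,625 candidates while the sequential count is 40, which is the gap the abstract refers to.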
Problem

Research questions and friction points this paper is trying to address.

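Addresses the poor performance of learning-based methods in warehouse-like MAPD environments with narrow pathways and long corridors
Reduces decision-making complexity from exponential to linear in the number of agents
Establishes order-invariant optimality for path-finding policies under sequence modeling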
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a Transformer for implicit information exchange between agents
Models path finding as a sequence modeling problem with order-invariant optimality
Reduces decision-making complexity from exponential to linear
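The idea of deciding agent by agent, with each decision conditioned on the decisions already made, can be sketched on a toy grid. This is not SePar: a greedy Manhattan-distance rule stands in for the Transformer, and the action set, the blocking rule, and all names (`ACTIONS`, `sequential_step`, `free`) are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]

# Assumed action set: wait plus the four grid moves (row delta, column delta).
ACTIONS: Dict[str, Cell] = {
    "wait": (0, 0),
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def manhattan(a: Cell, b: Cell) -> int:
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sequential_step(positions: List[Cell], goals: List[Cell], free: Set[Cell]) -> List[Cell]:
    """Choose one move per agent, in sequence order.

    Each agent sees the cells already claimed by agents earlier in the
    sequence, so no message passing between agent pairs is needed.
    Assumes every current position is a free cell, so waiting is always legal.
    """
    claimed: Set[Cell] = set()
    next_positions: List[Cell] = []
    for i, (pos, goal) in enumerate(zip(positions, goals)):
        # Conservative rule: cells currently occupied by other agents are blocked.
        occupied = {p for j, p in enumerate(positions) if j != i}
        candidates = []
        for dr, dc in ACTIONS.values():
            cell = (pos[0] + dr, pos[1] + dc)
            if cell in free and cell not in claimed and cell not in occupied:
                candidates.append(cell)
        # Greedy stand-in for a learned policy: move closest to the goal.
        best = min(candidates, key=lambda c: manhattan(c, goal))
        claimed.add(best)
        next_positions.append(best)
    return next_positions


# A one-row corridor of five cells with two agents heading toward each other.
free = {(0, c) for c in range(5)}
result = sequential_step([(0, 0), (0, 2)], [(0, 4), (0, 0)], free)
print(result)
```

In the corridor example, the first agent claims the cell between the two agents, so the second agent, deciding afterwards, waits instead of moving into the same cell. One pass over the agents is enough, which matches the linear decision count described above.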