🤖 AI Summary
To address the low computational efficiency and poor scalability of existing approaches to large-scale Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) with deterministic dynamics, such as multi-robot navigation, this paper introduces Det-Dec-POMDP, a model that explicitly encodes deterministic state transitions and observations. Building on this formulation, the authors propose Iterative Deterministic POMDP Planning (IDPP), an algorithm that tightly integrates joint-equilibrium policy search with deterministic action-observation modeling, augmented by efficient pruning and policy-iteration mechanisms. Experiments show that IDPP significantly outperforms state-of-the-art methods on large-scale Det-Dec-POMDP benchmarks, achieving order-of-magnitude speedups in multi-robot path planning while producing higher-quality coordination. Notably, IDPP is the first method to enable real-time collaborative planning for systems of up to one hundred agents.
📝 Abstract
Many high-level multi-agent planning problems, including multi-robot navigation and path planning, can be effectively modeled using deterministic actions and observations.
In this work, we focus on such domains and introduce the class of Deterministic Decentralized POMDPs (Det-Dec-POMDPs): a subclass of Dec-POMDPs in which transitions and observations are deterministic functions of the state and the joint action.
We then propose a practical solver called Iterative Deterministic POMDP Planning (IDPP). This method builds on the classic Joint Equilibrium-based Search for Policies (JESP) framework and is specifically optimized for large-scale Det-Dec-POMDPs that current Dec-POMDP solvers cannot handle efficiently.
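To make the determinism assumption concrete, here is a minimal illustrative sketch (not the paper's actual model; state names, actions, and function signatures are hypothetical): in a Det-Dec-POMDP, the transition and observation maps return a single outcome given the state and joint action, rather than a probability distribution over outcomes.

```python
from typing import Tuple

State = str
Action = str
Obs = str
JointAction = Tuple[Action, ...]

# Deterministic dynamics: T(s, joint_action) -> s'
# Toy two-robot corridor: the team advances one cell only when
# both robots choose "right"; otherwise the state is unchanged.
def transition(s: State, a: JointAction) -> State:
    corridor = {"s0": "s1", "s1": "s2", "s2": "s2"}
    return corridor[s] if all(ai == "right" for ai in a) else s

# Deterministic observations: O_i(s', joint_action) -> o_i per agent.
# Here every agent deterministically observes the resulting state.
def observe(s_next: State, a: JointAction) -> Tuple[Obs, ...]:
    return tuple(f"see_{s_next}" for _ in a)

# With deterministic T and O, a fixed joint policy induces exactly one
# action-observation trajectory, which is the structure a deterministic
# planner can exploit.
s = "s0"
for _ in range(2):
    a = ("right", "right")
    s = transition(s, a)
    print(s, observe(s, a))
```

Because each joint policy yields a single trajectory instead of a belief-weighted tree of outcomes, policy evaluation collapses to following one path, which is one intuition for why a Det-Dec-POMDP solver can scale far beyond general Dec-POMDP methods.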