🤖 AI Summary
To address the low computational efficiency and poor scalability of existing approaches to large-scale Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) with deterministic dynamics, such as multi-robot navigation, this paper introduces Det-Dec-POMDP, a model that explicitly encodes deterministic state transitions and observations. Building on this formulation, the authors propose Iterative Deterministic POMDP Planning (IDPP), an algorithm that tightly integrates joint-equilibrium policy search with deterministic action-observation modeling, augmented by efficient pruning and policy-iteration mechanisms. Experiments show that IDPP significantly outperforms state-of-the-art methods on large-scale Det-Dec-POMDP benchmarks, achieving order-of-magnitude speedups in multi-robot path planning while producing higher-quality coordination. Notably, IDPP is the first method to enable real-time collaborative planning for systems of up to one hundred agents.
📝 Abstract
Many high-level multi-agent planning problems, including multi-robot navigation and path planning, can be effectively modeled using deterministic actions and observations.
In this work, we focus on such domains and introduce the class of Deterministic Decentralized POMDPs (Det-Dec-POMDPs): a subclass of Dec-POMDPs in which transitions and observations are deterministic functions of the state and the joint action.
We then propose a practical solver called Iterative Deterministic POMDP Planning (IDPP). This method builds on the classic Joint Equilibrium-based Search for Policies (JESP) framework and is specifically optimized for large-scale Det-Dec-POMDPs that current Dec-POMDP solvers cannot handle efficiently.
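To make the determinism assumption concrete, here is a minimal illustrative sketch (not the paper's actual model; state names, actions, and function signatures are hypothetical): in a Det-Dec-POMDP, the transition and observation maps return a single outcome given the state and joint action, rather than a probability distribution over outcomes.

```python
from typing import Tuple

State = str
Action = str
Obs = str
JointAction = Tuple[Action, ...]

# Deterministic dynamics: T(s, joint_action) -> s'
# Toy two-robot corridor: the team advances one cell only when
# both robots choose "right"; otherwise the state is unchanged.
def transition(s: State, a: JointAction) -> State:
    corridor = {"s0": "s1", "s1": "s2", "s2": "s2"}
    return corridor[s] if all(ai == "right" for ai in a) else s

# Deterministic observations: O_i(s', joint_action) -> o_i per agent.
# Here every agent deterministically observes the resulting state.
def observe(s_next: State, a: JointAction) -> Tuple[Obs, ...]:
    return tuple(f"see_{s_next}" for _ in a)

# With deterministic T and O, a fixed joint policy induces exactly one
# action-observation trajectory, which is the structure a deterministic
# planner can exploit.
s = "s0"
for _ in range(2):
    a = ("right", "right")
    s = transition(s, a)
    print(s, observe(s, a))
```

Because each joint policy yields a single trajectory instead of a belief-weighted tree of outcomes, policy evaluation collapses to following one path, which is one intuition for why a Det-Dec-POMDP solver can scale far beyond general Dec-POMDP methods.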