The Value Problem for Multiple-Environment MDPs with Parity Objective

📅 2025-04-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates the worst-case value problem for parity objectives in multi-environment Markov decision processes (MEMDPs): determining whether a strategy exists that ensures the infimum probability of satisfying the parity objective across all environments equals one, and computing the supremum of this infimum. The authors establish that deciding whether the value equals one is PSPACE-complete in general, but becomes polynomial-time solvable when the number of environments is fixed. They prove that pure strategies suffice for almost-sure satisfaction, whereas general approximation requires randomization. Moreover, they devise the first double-exponential-space approximation algorithm for computing the worst-case value. This work uncovers a fundamental decidability gap between MEMDPs and partially observable MDPs (POMDPs), establishes an efficient decision framework for MEMDPs with a bounded number of environments, and provides the first algorithm for approximating the value of parity objectives in MEMDPs.

📝 Abstract
We consider multiple-environment Markov decision processes (MEMDPs), which consist of a finite set of MDPs over the same state space, representing different scenarios of transition structure and probability. The value of a strategy is the probability of satisfying the objective, here a parity objective, in the worst-case scenario, and the value of an MEMDP is the supremum of the values achievable by a strategy. We show that deciding whether the value is 1 is a PSPACE-complete problem, and even in P when the number of environments is fixed, along with new insights into the almost-sure winning problem, which is to decide if there exists a strategy with value 1. Pure strategies are sufficient for these problems, whereas randomization is necessary in general when the value is smaller than 1. We present an algorithm to approximate the value, running in double exponential space. Our results are in contrast to the related model of partially observable MDPs, where all these problems are known to be undecidable.
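The worst-case value described in the abstract can be stated compactly; the following is a sketch of the definition, with notation (the environments $M_1,\dots,M_k$, the strategy $\sigma$, and $\Pr^{\sigma}_{M_i}$) assumed here rather than taken from the paper:

```latex
% val(sigma): worst-case probability, over the k environments,
% that strategy sigma satisfies the parity objective.
% val(M): best such guarantee achievable by any strategy.
\[
  \mathrm{val}(\sigma) \;=\; \inf_{1 \le i \le k} \Pr\nolimits^{\sigma}_{M_i}(\mathrm{Parity}),
  \qquad
  \mathrm{val}(M) \;=\; \sup_{\sigma}\; \mathrm{val}(\sigma).
\]
```

The decision problems in the abstract then read: the value-1 problem asks whether $\mathrm{val}(M) = 1$, while the almost-sure winning problem asks whether some strategy $\sigma$ attains $\mathrm{val}(\sigma) = 1$.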
Problem

Research questions and friction points this paper is trying to address.

Determine MEMDP value for worst-case parity objectives
Establish the complexity of the value-1 decision problem
Compare MEMDP decidability with partially-observable MDPs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple-environment MDPs with parity objective
PSPACE-complete value decision problem
Double exponential space approximation algorithm
Krishnendu Chatterjee
Professor, IST Austria
Game theory, Logic and automata theory, Algorithms, Evolutionary game theory, Algorithmic game theory
L. Doyen
CNRS & LMF, ENS Paris-Saclay, France
Jean-François Raskin
Université Libre de Bruxelles, U.L.B.
Computer Science
O. Sankur
Université de Rennes, CNRS, Inria, France & Mitsubishi Electric R&D Centre Europe, France