🤖 AI Summary
This paper addresses the qualitative analysis of reachability and parity (ω-regular) objectives in robust Markov decision processes (RMDPs): deciding whether an objective can be guaranteed almost surely against worst-case environmental uncertainty. Unlike prior approaches that rely on structural assumptions such as unichain or aperiodicity, the paper proposes a unified decision algorithm, based on oracle access to the uncertainty sets, that requires no structural restrictions. The method combines game-theoretic semantics, symbolic fixed-point computation, robust optimization, and ω-automata theory. Because the algorithm queries uncertainty sets only in a black-box manner, it is both theoretically sound and practical to engineer. Evaluated on classical RMDP benchmarks from the literature scaling up to thousands of states, it returns correct results on all instances, improving on the scalability and applicability of existing methods.
📝 Abstract
Robust Markov Decision Processes (RMDPs) generalize classical MDPs by accounting for uncertainty in the transition probabilities, specified as a set of possible transition functions. An objective is a set of runs (i.e., infinite trajectories) of the RMDP, and the value of an objective is the maximal probability that the agent can guarantee against the adversarial environment. We consider (a) reachability objectives, where given a target set of states, the goal is to eventually arrive at one of them; and (b) parity objectives, which are a canonical representation for $\omega$-regular objectives. The qualitative analysis problem asks whether the objective can be ensured with probability 1. In this work, we study the qualitative problem for reachability and parity objectives on RMDPs without making any assumption on the structure of the RMDPs, e.g., unichain or aperiodicity. Our contributions are twofold. We first present efficient algorithms with oracle access to uncertainty sets that solve the qualitative problems for reachability and parity objectives. We then report experimental results demonstrating the effectiveness of our oracle-based approach on classical RMDP examples from the literature, scaling up to thousands of states.
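For intuition about qualitative (almost-sure) reachability, the classic nested fixed point for ordinary MDPs is sketched below. Since only the supports of transition distributions matter qualitatively, it works on the graph structure alone. This is a standard textbook construction for plain MDPs, not the paper's oracle-based RMDP algorithm, and all names in the sketch are illustrative.

```python
def almost_sure_reach(states, actions, succ, target):
    """Greatest fixed point W such that from every state in W some action
    (i) keeps the full support of its transition inside W, and
    (ii) allows reaching `target` with positive probability within W.
    succ[s][a] is the support (set of possible successors) of action a at s.
    Returns the set of states from which `target` is reached almost surely."""
    W = set(states)
    while True:
        # Least fixed point: states that can reach `target` using only
        # actions whose entire support stays inside the current candidate W.
        R = set(target) & W
        changed = True
        while changed:
            changed = False
            for s in W - R:
                if any(succ[s][a] <= W and succ[s][a] & R
                       for a in actions[s]):
                    R.add(s)
                    changed = True
        if R == W:       # fixed point reached: W is the almost-sure set
            return W
        W = R            # shrink the candidate set and repeat

# Toy example: from s0, repeating action 'a' hits t with probability 1;
# s1 is forced into the trap s2, so only s0 and t win almost surely.
mdp_succ = {
    's0': {'a': {'s0', 't'}, 'b': {'s2'}},
    's1': {'a': {'s2'}},
    's2': {'a': {'s2'}},      # trap state
    't':  {'a': {'t'}},       # absorbing target
}
mdp_actions = {s: set(d) for s, d in mdp_succ.items()}
winning = almost_sure_reach(set(mdp_succ), mdp_actions, mdp_succ, {'t'})
print(sorted(winning))  # ['s0', 't']
```

Note that from `s0` the run may loop for a while, but the probability of never hitting `t` decays geometrically, which is why the support-based condition suffices. The paper's contribution is extending this style of analysis to RMDPs, where transitions are drawn adversarially from uncertainty sets accessed through an oracle, and to the more general parity objectives.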