Sure-almost-sure and Sure-limit-sure Window Mean Payoff in Markov Decision Processes

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses strategy synthesis for Markov decision processes (MDPs) under window mean-payoff objectives when two constraints must hold simultaneously: (i) every outcome must achieve a payoff of at least a threshold α (the sure constraint), and (ii) a second threshold β must be met with probability one (almost-sure) or, for every ε > 0, with probability at least 1 − ε (limit-sure). The paper solves the combined sure-almost-sure and sure-limit-sure problems for both the fixed-window and bounded-window variants. It shows that both problems are in P for the fixed variant (when the window length is given in unary) and in NP ∩ coNP for the bounded variant, matching the complexity of sure satisfaction and almost-sure satisfaction considered separately. It also gives bounds on the memory required by winning strategies.

📝 Abstract

Given rationals $\alpha$ and $\beta$, the sure-almost-sure problem for a quantitative objective $\varphi$ in a Markov decision process (MDP) asks if one can simultaneously ensure that all outcomes of the MDP have $\varphi$-value at least $\alpha$ (i.e. sure $\alpha$ satisfaction) and with probability $1$ the outcome has $\varphi$-value at least $\beta$ (i.e. almost-sure $\beta$ satisfaction). The sure-limit-sure problem asks if for all $\varepsilon > 0$ one can simultaneously ensure that all outcomes have $\varphi$-value at least $\alpha$ and with probability at least $1 - \varepsilon$ the outcome has $\varphi$-value at least $\beta$. Moreover, if simultaneous satisfaction of objectives is possible, then one would also like to construct a strategy (for sure-almost-sure) or a family of strategies (for sure-limit-sure) that achieves this. In this paper, we solve the sure-almost-sure and sure-limit-sure problems for window mean-payoff objectives. The window mean-payoff objective strengthens the standard mean-payoff objective by requiring that the average payoff of a finite window that slides over an infinite run be greater than a given threshold. We study two variants of window mean payoff: in the fixed variant, the window length $\ell$ is given, while in the bounded variant, the length is not given but is required to be bounded throughout the run. We show that the sure-almost-sure problem and the sure-limit-sure problem are both in P for the fixed variant (if $\ell$ is given in unary) and are both in NP $\cap$ coNP for the bounded variant, matching the computational complexity of sure satisfaction and almost-sure satisfaction when considered separately for these objectives. We also give bounds for the memory requirement of winning strategies for all considered problems.
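To make the fixed-window condition concrete, here is a minimal sketch of a check on a finite prefix of payoffs. It is not the paper's algorithm. It assumes the usual "good window" reading of the objective: from each position, some window of length at most $\ell$ must reach an average payoff of at least the threshold. The abstract does not spell out this definition, and the function names `window_closes` and `fixed_window_ok` are hypothetical.

```python
from fractions import Fraction


def window_closes(payoffs, start, max_len, threshold):
    """Return True if some window of length 1..max_len starting at `start`
    has average payoff >= threshold (assumed 'good window' condition)."""
    total = Fraction(0)
    for length in range(1, max_len + 1):
        end = start + length
        if end > len(payoffs):
            return False  # the finite prefix ends before the window closes
        total += payoffs[end - 1]
        # total / length >= threshold, written without division
        if total >= threshold * length:
            return True
    return False


def fixed_window_ok(payoffs, max_len, threshold):
    """Check the window condition at every position that still has a full
    horizon of max_len payoffs available in the finite prefix."""
    last_start = len(payoffs) - max_len
    return all(
        window_closes(payoffs, i, max_len, threshold)
        for i in range(last_start + 1)
    )


prefix = [2, -1, 3, -1, 3]
print(fixed_window_ok(prefix, max_len=2, threshold=1))
print(fixed_window_ok(prefix, max_len=1, threshold=1))
```

With window length 2, each dip of -1 is compensated by the following payoff of 3, so the check passes. With window length 1, every single payoff would have to meet the threshold, so it fails. This only inspects one finite run. The sure and almost-sure problems quantify over all outcomes of a strategy in the MDP, which this sketch does not model.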

Problem

Research questions and friction points this paper is trying to address.

Markov Decision Processes

Window Mean Payoff

Sure-Almost-Sure

Sure-Limit-Sure

Quantitative Objectives

Innovation

Methods, ideas, or system contributions that make the work stand out.

window mean payoff

Markov decision processes

sure-almost-sure synthesis