Stochastic Window Mean-payoff Games

📅 2023-04-23

🏛️ Foundations of Software Science and Computation Structures

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This paper studies strategy synthesis for two-player stochastic games with window mean-payoff objectives, in which the average payoff is computed over a finite sliding window rather than over the long run. The goal is to synthesize strategies for Player 1 that guarantee, against every strategy of Player 2, that the objective holds with positive probability, almost surely, or with probability at least $p$. Two variants are considered: one where the maximum window length is fixed and given, and one where it is bounded but not fixed in advance. The solution relies on a reduction from stochastic to non-stochastic games, together with a general characterization of the prefix-independent objectives for which the reduction holds. By this reduction, the memory requirements for both players in stochastic games match those in non-stochastic games. For non-stochastic games, the paper improves the upper bound on Player 1's memory and the lower bound on Player 2's memory, and it gives complexity bounds and algorithms for all variants.

📝 Abstract

Stochastic two-player games model systems with an environment that is both adversarial and stochastic. The adversarial part of the environment is modeled by a player (Player 2) who tries to prevent the system (Player 1) from achieving its objective. We consider finitary versions of the traditional mean-payoff objective, replacing the long-run average of the payoffs with the payoff average computed over a finite sliding window. Two variants have been considered: in one variant, the maximum window length is fixed and given, while in the other, it is not fixed but is required to be bounded. For both variants, we present complexity bounds and algorithmic solutions for computing strategies for Player 1 to ensure that the objective is satisfied with positive probability, with probability 1, or with probability at least $p$, regardless of the strategy of Player 2. The solution crucially relies on a reduction to the special case of non-stochastic two-player games. We give a general characterization of prefix-independent objectives for which this reduction holds. By our reduction, the memory requirement for both players in stochastic games is the same as in non-stochastic games. Moreover, for non-stochastic games, we improve upon the upper bound for the memory requirement of Player 1 and upon the lower bound for the memory requirement of Player 2.
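To make the sliding-window idea concrete, here is a minimal Python sketch of a window check on a finite payoff sequence. It is not taken from the paper: it assumes one common formalization in which a window starting at a position "closes" as soon as the average payoff over its first l steps reaches the threshold, for some l up to a maximum window length. The function names `window_closes` and `open_positions`, the default threshold of 0, and the restriction to a finite prefix are all assumptions made for illustration.

```python
def window_closes(payoffs, start, max_len, threshold=0.0):
    """Return the smallest window length l <= max_len such that the average of
    payoffs[start:start + l] is at least `threshold`, or None if no such
    window exists within the available sequence.

    Hypothetical helper for illustration only.
    """
    total = 0
    for length in range(1, max_len + 1):
        idx = start + length - 1
        if idx >= len(payoffs):
            # Ran off the end of the finite prefix before the window closed.
            return None
        total += payoffs[idx]
        # average >= threshold  <=>  total >= threshold * length
        if total >= threshold * length:
            return length
    return None


def open_positions(payoffs, max_len, threshold=0.0):
    """List the positions whose full window of length max_len fits in the
    sequence but never closes (assumed finite-prefix view of the objective)."""
    last_start = len(payoffs) - max_len
    return [
        i
        for i in range(last_start + 1)
        if window_closes(payoffs, i, max_len, threshold) is None
    ]


if __name__ == "__main__":
    payoffs = [1, -1, 2, -3, 1, 1]
    print(open_positions(payoffs, max_len=2))
```

With these payoffs and a maximum window length of 2, the window starting at position 3 stays open, because the sums -3 and -3 + 1 are both negative. In the game setting, Player 1 tries to keep such open windows from occurring (forever, or from some point on, depending on the variant), while Player 2 and the random transitions may work against that.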

Problem

Research questions and friction points this paper is trying to address.

Analyzing finitary mean-payoff objectives in stochastic adversarial games

Computing strategies for probabilistic satisfaction under sliding window constraints

Establishing complexity bounds and memory requirements via reduction techniques

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mean-payoff objectives computed as the average payoff over a finite sliding window

Reduction from stochastic games to non-stochastic two-player games

Improved memory bounds for both players in non-stochastic games
