Finite-memory Strategies for Almost-sure Energy-MeanPayoff Objectives in MDPs

📅 2024-04-22
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This paper studies finite-state Markov decision processes (MDPs) with a combined objective: never running out of energy while attaining a strictly positive mean payoff (long-run average reward) in a second dimension. It shows that finite-memory strategies suffice for winning almost surely, in contrast to the closely related Energy-Parity objective, where almost surely winning strategies require infinite memory in general. The memory bound is tight: exponential memory is sufficient, even for deterministic strategies, and necessary, even for randomized ones. The upper bound still holds when the mean-payoff part of the objective is generalized to multidimensional strictly positive mean payoff. Whether an almost surely winning strategy exists is decidable in pseudo-polynomial time.


📝 Abstract
We consider finite-state Markov decision processes with the combined Energy-MeanPayoff objective. The controller tries to avoid running out of energy while simultaneously attaining a strictly positive mean payoff in a second dimension. We show that finite memory suffices for almost surely winning strategies for the Energy-MeanPayoff objective. This is in contrast to the closely related Energy-Parity objective, where almost surely winning strategies require infinite memory in general. We show that exponential memory is sufficient (even for deterministic strategies) and necessary (even for randomized strategies) for almost surely winning Energy-MeanPayoff. The upper bound holds even if the strictly positive mean payoff part of the objective is generalized to multidimensional strictly positive mean payoff. Finally, it is decidable in pseudo-polynomial time whether an almost surely winning strategy exists.
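For orientation, the combined objective can be written out in one common formalization. This is a sketch, not the paper's notation: the symbols $r_1$ (energy reward), $r_2$ (mean-payoff reward), $k$ (initial energy level), $e_i$ (the $i$-th transition of a run $\rho$) and the choice of $\liminf$ are all assumptions made here for illustration.

```latex
\mathrm{EN}(k) = \Bigl\{ \rho \;:\; \forall n \ge 0,\ \ k + \sum_{i=0}^{n-1} r_1(e_i) \ge 0 \Bigr\},
\qquad
\mathrm{MP}(>0) = \Bigl\{ \rho \;:\; \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} r_2(e_i) > 0 \Bigr\}

% Almost-sure winning from state s with initial energy k:
\exists \sigma \;:\; \mathbb{P}^{\sigma}_{s}\bigl( \mathrm{EN}(k) \cap \mathrm{MP}(>0) \bigr) = 1
```

Under this reading, a strategy $\sigma$ wins almost surely if the set of runs that both keep the energy level non-negative forever and have strictly positive mean payoff has probability one.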
Problem

Research questions and friction points this paper is trying to address.

Finite-memory strategies for combined Energy-MeanPayoff objectives
Avoid energy depletion while achieving positive mean payoff
Deciding existence of winning strategies in pseudo-polynomial time
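The two conditions above can be illustrated on a finite run prefix. The following is a minimal sketch, not the paper's algorithm: the function name `check_prefix`, the encoding of each step as an `(energy_delta, payoff)` pair, and the initial energy parameter are hypothetical. The actual objective concerns infinite runs, so a finite prefix can only refute the energy condition and give an empirical average for the payoff.

```python
from typing import List, Tuple


def check_prefix(steps: List[Tuple[int, int]], initial_energy: int) -> Tuple[bool, float]:
    """Evaluate a finite run prefix (illustrative only).

    steps: one (energy_delta, payoff) pair per transition (hypothetical encoding).
    Returns (energy_ok, mean_payoff), where energy_ok is True iff the energy
    level never drops below zero along the prefix, and mean_payoff is the
    average payoff over the prefix (0.0 for an empty prefix).
    """
    energy = initial_energy
    energy_ok = True
    total_payoff = 0
    for energy_delta, payoff in steps:
        energy += energy_delta
        if energy < 0:
            # The energy condition is violated at this step.
            energy_ok = False
        total_payoff += payoff
    mean_payoff = total_payoff / len(steps) if steps else 0.0
    return energy_ok, mean_payoff
```

For example, `check_prefix([(1, 1), (-2, 0), (1, 1)], 1)` keeps the energy level at 2, 0, 1 and so satisfies the energy condition on this prefix, with an average payoff of 2/3.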
Innovation

Methods, ideas, or system contributions that make the work stand out.

Finite memory suffices for winning strategies
Exponential memory is sufficient and necessary
Decidability in pseudo-polynomial time for strategy existence