🤖 AI Summary
This paper addresses the computation of least fixed points of monotone nonexpansive operators over the nonnegative reals, a problem arising in quantitative semantics, model-based reinforcement learning for Markov decision processes (MDPs), and probabilistic system verification. We propose a damped Mann iteration method, the first application of damped Mann iteration to least fixed-point approximation, thereby overcoming the limitation of standard iteration schemes, which may converge to a fixed point that is not the least. Crucially, our method guarantees almost-sure convergence to the least fixed point even when the operator is accessible only through asymptotic approximations, e.g., sampling-based estimates with probabilistic error bounds. The theoretical analysis integrates monotone operator theory with properties of nonexpansive mappings, and experimental evaluation demonstrates convergence to the optimal expected return in MDPs. To our knowledge, this work provides the first framework for least fixed-point computation that is simultaneously robust to imprecision, provably convergent, and practically implementable in sampling-driven probabilistic systems.
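The core idea can be illustrated with a small sketch. The exact damping schedule used in the paper may differ; here we assume a Mann-style update x_{n+1} = (1 - δ_n)((1 - α)x_n + α f(x_n)) with δ_n → 0 and Σδ_n = ∞, and a hypothetical monotone nonexpansive map `f(x) = max(x, 1)` whose fixpoint set is [1, ∞), so the least fixpoint is 1. Plain iteration started at x₀ = 5 is immediately stuck at the non-least fixpoint 5, while the dampened iteration drifts down to 1.

```python
def damped_mann(f, x0, steps, alpha=0.5):
    """Mann iteration with a vanishing dampening factor (illustrative sketch).

    Assumed update (details may differ from the paper's scheme):
        x_{n+1} = (1 - delta_n) * ((1 - alpha) * x_n + alpha * f(x_n))
    with delta_n = 1/(n+2), so delta_n -> 0 while sum(delta_n) diverges.
    The dampening keeps nudging iterates downward, which prevents the
    iteration from settling at a fixpoint above the least one.
    """
    x = x0
    for n in range(steps):
        delta = 1.0 / (n + 2)
        x = (1.0 - delta) * ((1.0 - alpha) * x + alpha * f(x))
    return x

# Hypothetical monotone, nonexpansive map on the nonnegative reals:
# every x >= 1 is a fixpoint, and the least fixpoint is 1.
f = lambda x: max(x, 1.0)

# Plain fixpoint iteration from x0 = 5 never moves: f(5) = 5.
assert f(5.0) == 5.0

x = damped_mann(f, x0=5.0, steps=5000)  # approaches the least fixpoint 1
```

The choice δ_n = 1/(n+2) is one standard schedule satisfying the divergent-sum condition; any sequence with δ_n → 0 and Σδ_n = ∞ plays the same role.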
📝 Abstract
Fixpoints are ubiquitous in computer science, and when dealing with quantitative semantics and verification one is commonly led to consider least fixpoints of (higher-dimensional) functions over the nonnegative reals. We show how to approximate the least fixpoint of such functions, focusing on the case in which they are not known precisely, but are represented by a sequence of approximating functions that converges to them. We concentrate on monotone and non-expansive functions, for which uniqueness of fixpoints is not guaranteed and standard fixpoint iteration schemes might get stuck at a fixpoint that is not the least. Our main contribution is the identification of an iteration scheme, a variation of Mann iteration with a dampening factor, which, under suitable conditions, is shown to guarantee convergence to the least fixpoint of the function of interest. We then argue that these results are relevant in the context of model-based reinforcement learning for Markov decision processes (MDPs), showing that the proposed iteration scheme instantiates to MDPs and allows us to derive convergence to the optimal expected return. More generally, we show that our results can be used to iterate to the least fixpoint almost surely for systems where the function of interest can be approximated with given probabilistic error bounds, as happens for probabilistic systems, such as simple stochastic games, that can be explored via sampling.
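The almost-sure convergence claim for approximated functions can be pictured with a toy sampling-style variant of the sketch above: each step evaluates a noisy estimate of f whose error bound shrinks over time, standing in for sampling-based estimates that improve as more samples are drawn. The map, the noise model, and the schedules below are illustrative assumptions, not the paper's construction.

```python
import random

def damped_mann_noisy(f, x0, steps, alpha=0.5, seed=0):
    """Damped Mann iteration driven by noisy evaluations of f (sketch).

    At step n the iteration only sees f(x_n) + e_n, where the error
    bound |e_n| <= 1/(n+1) shrinks over time, mimicking sampling-based
    estimates with improving probabilistic error bounds. All schedules
    here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(steps):
        delta = 1.0 / (n + 2)
        noise = rng.uniform(-1.0, 1.0) / (n + 1)   # shrinking estimation error
        fx_est = max(f(x) + noise, 0.0)            # clip to the nonnegative reals
        x = (1.0 - delta) * ((1.0 - alpha) * x + alpha * fx_est)
    return x

# Same illustrative map as before: fixpoint set [1, oo), least fixpoint 1.
f = lambda x: max(x, 1.0)
x = damped_mann_noisy(f, x0=5.0, steps=5000)  # approaches the least fixpoint 1
```

Because the per-step noise is summably small and the dampening factor vanishes slowly, the noisy iterates track the deterministic ones, which is the intuition behind the almost-sure convergence result stated in the abstract.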