🤖 AI Summary
Conventional Lyapunov-based methods struggle to certify incremental input-to-state stability (δISS) for reinforcement learning (RL) systems, because value functions are typically non-smooth and unbounded on both sides and therefore violate standard Lyapunov requirements.
Method: The paper introduces an analytical framework that uses the reward function, rather than a Lyapunov function, as the test function. Value functions are built by exponentially discounting Lipschitz reward functions, with the reward allowed to be chosen adversarially from a class of Hölder-continuous functions; on this basis, the paper establishes an equivalence between a variant of δISS and the regularity of RL-style value functions (sketched below).
Contribution/Results: The approach dispenses with the classical Lyapunov requirements of smoothness, boundedness, and positive definiteness, enabling δISS certification with rewards that are non-smooth and unbounded on both sides. The result shows that δISS can be verified directly on the closed-loop system induced by a policy, without constructing an explicit Lyapunov function, offering a new paradigm for stability analysis of learning-based control systems.
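As a rough sketch of the objects involved (the notation here is illustrative, not taken from the paper), the RL-style value function is the discounted accumulation of a reward along the closed-loop trajectory:

$$
V_r^{\pi}(x) \;=\; \sum_{t=0}^{\infty} \gamma^{t}\, r\big(\phi_t^{\pi}(x)\big), \qquad \gamma \in (0,1),
$$

where $\phi_t^{\pi}(x)$ denotes the state at time $t$ of the closed-loop system started from $x$ under policy $\pi$, and the reward $r$ is Lipschitz, or in the adversarial variant Hölder continuous, i.e. $|r(x) - r(y)| \le L\,|x - y|^{\alpha}$ for some $\alpha \in (0,1]$, while being allowed to be non-smooth and unbounded above and below.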
📝 Abstract
This paper presents a novel framework for analyzing incremental input-to-state stability ($\delta$ISS) based on the idea of using rewards as "test functions." Whereas control theory traditionally works with Lyapunov functions that satisfy a time-decrease condition, reinforcement learning (RL) value functions are constructed by exponentially discounting a Lipschitz reward function that may be non-smooth and unbounded on both sides. Such RL-style value functions therefore cannot be read directly as Lyapunov certificates. We develop a new equivalence between a variant of incremental input-to-state stability of the closed-loop system under a given policy and the regularity of RL-style value functions under adversarial selection of a Hölder-continuous reward function. This result shows that the regularity of value functions, and its connection to incremental stability, can be understood in a way that is distinct from the traditional Lyapunov-based route to certifying stability in control theory.
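For reference, a standard textbook formulation of $\delta$ISS (the paper studies a variant, so this should be read as an assumed baseline rather than the paper's exact definition) asks that trajectories from nearby initial states and inputs stay close and converge:

$$
\big|\phi_t(x, u) - \phi_t(x', u')\big| \;\le\; \beta\big(|x - x'|,\, t\big) + \rho\big(\|u - u'\|_{\infty}\big)
$$

for all $t \ge 0$, some $\beta \in \mathcal{KL}$, and some $\rho \in \mathcal{K}$. Informally, if the closed-loop map contracts in this sense, then discounted sums of Hölder-continuous rewards inherit Hölder regularity in the initial condition; the equivalence asserts that, when the reward may be chosen adversarially from the Hölder class, this regularity in fact characterizes the stability property.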