🤖 AI Summary
This work addresses a challenge in learning-based control: goal-oriented policies often fail to ensure safety and closed-loop stability at the same time. The paper proposes the Predictive Safety–Stability Filter (PS2F), which uses a cascaded two-layer model predictive control (MPC) architecture. An upper-layer nominal MPC generates safety- and stability-certified predicted trajectories, while a lower-layer filter modifies external commands in real time to keep the system within a safe and asymptotically stable region. The approach provides guarantees of both safety and asymptotic stability within a single framework, adds no conservatism beyond that of the nominal MPC, and enables smooth transitions between exploration and exploitation. Theoretical analysis establishes recursive feasibility and asymptotic stability of the closed-loop system, and comparative numerical experiments demonstrate the framework's effectiveness.
📝 Abstract
Ensuring both safety and stability remains a fundamental challenge in learning-based control, where goal-oriented policies often neglect system constraints and closed-loop state convergence. To address this limitation, this paper introduces the Predictive Safety–Stability Filter (PS2F), a unified predictive filter framework that guarantees constraint satisfaction and asymptotic stability within a single architecture. The PS2F framework comprises two cascaded optimal control problems: a nominal model predictive control (MPC) layer that serves solely as a copilot, implicitly defining a Lyapunov function and generating safety- and stability-certified predicted trajectories, and a secondary filtering layer that adjusts the external command so that the system remains within a provably safe and stable region. This cascaded structure enables PS2F to inherit the theoretical guarantees of nominal MPC while accommodating goal-oriented external commands. Rigorous analysis establishes recursive feasibility and asymptotic stability of the closed-loop system without introducing additional conservatism beyond that associated with the nominal MPC. Furthermore, a time-varying parameterisation allows PS2F to transition smoothly between safety-prioritised and stability-oriented operation modes, providing a principled mechanism for balancing exploration and exploitation. The effectiveness of the proposed framework is demonstrated through comparative numerical experiments.
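To make the filtering idea concrete, the following toy sketch may help. It is not the paper's formulation: it assumes a scalar linear system x⁺ = a·x + b·u with box constraints, uses V(x) = |x| as a stand-in for the Lyapunov function that the nominal MPC layer would define implicitly, and replaces the predictive horizon with a single step. The function names, the blending weight `s`, and all numerical values are hypothetical choices made for illustration.

```python
def filter_command(x, u_ext, s, a=1.2, b=1.0, x_max=2.0, u_max=1.0, gamma=0.8):
    """Return the admissible input closest to the external command u_ext.

    Toy assumptions (not from the paper): scalar system x+ = a*x + b*u with
    b > 0, constraints |x| <= x_max and |u| <= u_max, and V(x) = |x| as a
    stand-in Lyapunov function. The weight s in [0, 1] blends the bound on
    |x+| from x_max (s = 0, safety only) to gamma*|x| (s = 1, contraction).
    """
    bound = (1.0 - s) * x_max + s * gamma * abs(x)
    # Inputs that satisfy |a*x + b*u| <= bound and |u| <= u_max form an interval.
    lo = max(-u_max, (-bound - a * x) / b)
    hi = min(u_max, (bound - a * x) / b)
    if lo > hi + 1e-12:
        raise ValueError("no admissible input: state outside the toy safe set")
    # Projecting u_ext onto [lo, hi] minimises (u - u_ext)**2 over that interval.
    return min(max(u_ext, lo), hi)


def simulate(x0, u_ext, steps, ramp, a=1.2, b=1.0):
    """Apply the filter in closed loop while s ramps linearly from 0 to 1."""
    xs = [x0]
    for k in range(steps):
        s = min(1.0, k / ramp)  # hypothetical time-varying schedule
        u = filter_command(xs[-1], u_ext, s, a=a, b=b)
        xs.append(a * xs[-1] + b * u)
    return xs


if __name__ == "__main__":
    # A constant external command that pushes the state towards the constraint.
    trajectory = simulate(x0=1.5, u_ext=1.0, steps=40, ramp=10)
    print("largest |x|:", max(abs(x) for x in trajectory))
    print("final x:", trajectory[-1])
```

With `s = 0` the filter only intervenes when the external command would violate a constraint, and with `s = 1` it additionally forces |x| to shrink by the factor `gamma` at every step. In this toy setting an admissible input exists for every |x| ≤ 2 and every `s` in [0, 1]; this holds only for the specific numbers chosen here and should not be read as the paper's recursive feasibility result.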