🤖 AI Summary
Existing neural Lyapunov-barrier certificates struggle to guarantee safety and stability when system dynamics are subject to perturbations. This work formally defines, for the first time, a notion of certifiably robust neural Lyapunov-barrier functions against dynamic perturbations and establishes sufficient conditions for such robustness. Leveraging Lipschitz continuity, the proposed approach combines adversarial training, Lipschitz neighborhood constraints, and global Lipschitz regularization in a joint training objective. Evaluated on inverted pendulum and 2D docking tasks, the method significantly outperforms baseline approaches, achieving up to a 4.6× improvement in certified robustness margins and up to a 2.4× increase in empirical success rates under strong perturbations.
📝 Abstract
Neural Lyapunov and barrier certificates have recently emerged as powerful tools for verifying the safety and stability properties of deep reinforcement learning (RL) controllers. However, existing methods offer guarantees only under fixed, idealized (unperturbed) dynamics, limiting their reliability in real-world applications where dynamics may deviate due to uncertainties. In this work, we study the problem of synthesizing \emph{robust neural Lyapunov-barrier certificates} that maintain their guarantees under perturbations in system dynamics. We formally define a robust Lyapunov-barrier function and specify sufficient conditions, based on Lipschitz continuity, that ensure robustness against bounded perturbations. We propose practical training objectives that enforce these conditions via adversarial training, Lipschitz neighborhood bounds, and global Lipschitz regularization. We validate our approach in two practically relevant environments, Inverted Pendulum and 2D Docking. The former is a widely studied benchmark, while the latter is a safety-critical task in autonomous systems. We show that our methods significantly improve both certified robustness bounds (up to $4.6$ times) and empirical success rates under strong perturbations (up to $2.4$ times) compared to the baseline. Our results demonstrate the effectiveness of training robust neural certificates for safe RL under perturbations in dynamics.
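The Lipschitz-based sufficient condition in the abstract can be illustrated with a minimal sketch (not the paper's implementation; the certificate `V`, dynamics `f`, and constants below are toy assumptions): if a certificate $V$ with Lipschitz constant $L_V$ decreases by at least $c$ along the nominal dynamics at a state $x$, then the decrease survives any perturbation of the next state with norm at most $c / L_V$, since $V(f(x) + d) - V(x) \le -c + L_V \|d\| \le 0$.

```python
import numpy as np

def V(x):
    """Toy quadratic certificate (assumed for illustration).
    On the unit ball its gradient norm ||2x|| <= 2, so L_V <= 2."""
    return float(x @ x)

def f(x):
    """Toy contractive nominal dynamics (assumed for illustration)."""
    return 0.5 * x

L_V = 2.0                    # Lipschitz bound of V on the unit ball
x = np.array([0.6, -0.3])    # a state inside the unit ball

decrease = V(x) - V(f(x))    # certified decrease c along nominal dynamics
delta_max = decrease / L_V   # largest certified perturbation radius c / L_V

# Any disturbance d of the next state with ||d|| <= delta_max
# cannot destroy the decrease condition V(f(x) + d) < V(x).
rng = np.random.default_rng(0)
for _ in range(100):
    d = rng.normal(size=2)
    d *= delta_max / np.linalg.norm(d)   # scale to the worst-case norm
    assert V(f(x) + d) - V(x) <= 1e-9    # decrease still holds
```

The same argument, applied pointwise over the verified region, is what turns a Lipschitz bound on the certificate into a certified robustness margin against bounded dynamic perturbations.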