🤖 AI Summary
This work addresses reach-avoid verification of neural network policies in discrete-time stochastic systems. We use a learner-verifier framework that learns a neural network certificate and then verifies that it is a reach-avoid supermartingale (RASM), which proves that the specification holds. Existing verifiers rely on Lipschitz constants of the networks, and these constants become large for specifications with high threshold probabilities. We make two contributions to obtain smaller ones: logarithmic RASMs (logRASMs), which take exponentially smaller values than RASMs and therefore have lower theoretical Lipschitz constants, and a fast method for computing tighter Lipschitz upper bounds based on weighted norms. In our empirical evaluation, the approach consistently verifies reach-avoid specifications with probabilities as high as 99.9999%, a regime in which existing approaches struggle.
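To illustrate why weighted norms can tighten a Lipschitz bound, the sketch below bounds the Lipschitz constant of a small ReLU network by a product of per-layer operator norms. It is not the paper's algorithm: the helper names (`induced_norm`, `lipschitz_bound`), the choice of weighted 1-norms, the toy weight matrices, and the hand-picked hidden-layer weights are all assumptions made for illustration.

```python
def induced_norm(A, w_in, w_out):
    """Operator norm of x -> A x from the weighted 1-norm with weights w_in
    to the weighted 1-norm with weights w_out, where ||x||_w = sum_i w[i]*|x[i]|."""
    rows, cols = len(A), len(A[0])
    return max(
        sum(w_out[i] * abs(A[i][j]) for i in range(rows)) / w_in[j]
        for j in range(cols)
    )


def lipschitz_bound(layers, hidden_weights):
    """Upper bound on the Lipschitz constant of a ReLU network (plain 1-norm on
    input and output). ReLU is 1-Lipschitz in any weighted 1-norm, so the
    per-layer induced norms multiply."""
    n_in, n_out = len(layers[0][0]), len(layers[-1])
    weights = [[1.0] * n_in] + hidden_weights + [[1.0] * n_out]
    bound = 1.0
    for A, w_in, w_out in zip(layers, weights, weights[1:]):
        bound *= induced_norm(A, w_in, w_out)
    return bound


# Toy network (hypothetical): f(x) = relu(10*x1) + 10*relu(x2)
W1 = [[10.0, 0.0], [0.0, 1.0]]
W2 = [[1.0, 10.0]]

unweighted = lipschitz_bound([W1, W2], [[1.0, 1.0]])   # plain 1-norm everywhere
weighted = lipschitz_bound([W1, W2], [[1.0, 10.0]])    # rescaled hidden layer
print(unweighted, weighted)
```

For this toy network the unweighted product overestimates the true Lipschitz constant, while rescaling the hidden layer yields a smaller bound. Choosing good weights automatically is the hard part and is not attempted here.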
📝 Abstract
We consider the verification of neural network policies for discrete-time stochastic systems with respect to reach-avoid specifications. We use a learner-verifier procedure that learns a certificate for the specification, represented as a neural network. Verifying that this neural network certificate is a so-called reach-avoid supermartingale (RASM) proves the satisfaction of a reach-avoid specification. Existing approaches for such a verification task rely on computed Lipschitz constants of neural networks. These approaches struggle with large Lipschitz constants, especially for reach-avoid specifications with high threshold probabilities. We present two key contributions to obtain smaller Lipschitz constants than existing approaches. First, we introduce logarithmic RASMs (logRASMs), which take exponentially smaller values than RASMs and hence have lower theoretical Lipschitz constants. Second, we present a fast method to compute tighter upper bounds on Lipschitz constants based on weighted norms. Our empirical evaluation shows we can consistently verify the satisfaction of reach-avoid specifications with probabilities as high as 99.9999%.
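The claim that logRASMs take exponentially smaller values can be made concrete with a small calculation. The sketch below assumes the usual RASM condition from prior work, namely that the certificate is at most 1 on initial states and at least 1/(1 - p) on unsafe states for threshold probability p. The abstract does not give the exact logRASM definition, so taking the natural logarithm of that threshold is an illustrative assumption, and the function name is hypothetical.

```python
import math


def rasm_unsafe_threshold(p):
    """Value a standard RASM must reach on unsafe states: 1 / (1 - p).
    (Assumed from the usual RASM definition, not stated in the abstract.)"""
    return 1.0 / (1.0 - p)


for p in (0.9, 0.99, 0.9999, 0.999999):
    threshold = rasm_unsafe_threshold(p)
    # Illustrative assumption: a logarithmic certificate only has to span
    # the log of this range.
    log_threshold = math.log(threshold)
    print(f"p={p}: RASM threshold ~ {threshold:.0f}, log threshold ~ {log_threshold:.2f}")
```

At p = 99.9999% the standard threshold is on the order of 10^6, whereas its logarithm is about 13.8. A certificate that must climb from 1 to 10^6 over a bounded state space needs a far steeper slope, and hence a larger Lipschitz constant, than one that only has to span the logarithmic range.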