🤖 AI Summary
Problem: Rigorous safety guarantees are hard to obtain for vision-based autonomous systems under environmental uncertainty and distribution shift.
Method: The paper proposes a unified probabilistic framework for end-to-end safety analysis of perception models and decision-making policies. It combines abstraction-based verification of the perception model with offline validation, and models the perception and decision loop as an interval Markov decision process (IMDP) so that the resulting guarantee adapts to out-of-distribution test-time conditions. The framework is evaluated on a synthetic perception Markov chain with well-defined state-estimation distributions and on a mountain car benchmark.
Results: On both the synthetic perception Markov chain and the mountain car benchmark, the framework yields tight yet rigorous bounds on overall system safety. It is reported to outperform existing abstraction-based verification approaches in both precision and scalability.
📝 Abstract
Precise and comprehensive situational awareness is a critical capability of modern autonomous systems. Deep neural networks that perceive task-critical details from rich sensory signals have become ubiquitous; however, their black-box behavior and sensitivity to environmental uncertainty and distribution shifts make them challenging to verify formally. Abstraction-based verification techniques for vision-based autonomy produce safety guarantees contingent on rigid assumptions, such as bounded errors or known unique distributions. Such overly restrictive and inflexible assumptions limit the validity of the guarantees, especially in diverse and uncertain test-time environments. We propose a methodology that unifies the verification models of perception with their offline validation. Our methodology leverages interval MDPs and provides a flexible end-to-end guarantee that adapts directly to the out-of-distribution test-time conditions. We evaluate our methodology on a synthetic perception Markov chain with well-defined state estimation distributions and a mountain car benchmark. Our findings reveal that we can guarantee tight yet rigorous bounds on overall system safety.
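To make the role of interval models concrete, the following is a minimal sketch of how a worst-case safety bound can be computed on an interval Markov chain, where each transition probability is known only to lie in an interval. It is a generic textbook-style illustration, not the paper's algorithm: the function names (`worst_case_expectation`, `safety_lower_bound`), the two-state chain, and the interval values are all assumptions chosen for the example.

```python
from typing import List, Set, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) on one transition probability


def worst_case_expectation(intervals: List[Interval], values: List[float]) -> float:
    """Minimise sum_j p_j * values[j] subject to lo_j <= p_j <= hi_j and sum_j p_j = 1.

    Greedy solution: start every successor at its lower bound, then hand the
    remaining probability mass to the successors with the smallest values first.
    """
    probs = [lo for lo, _ in intervals]
    budget = 1.0 - sum(probs)
    for j in sorted(range(len(values)), key=lambda k: values[k]):
        extra = min(intervals[j][1] - probs[j], budget)
        probs[j] += extra
        budget -= extra
    # If mass is left over, the intervals admit no valid distribution.
    assert abs(budget) < 1e-9, "interval bounds are inconsistent"
    return sum(p * v for p, v in zip(probs, values))


def safety_lower_bound(
    P: List[List[Interval]], unsafe: Set[int], horizon: int
) -> List[float]:
    """Lower bound, per start state, on the probability of avoiding `unsafe`
    for `horizon` steps under every distribution allowed by the intervals."""
    n = len(P)
    values = [0.0 if s in unsafe else 1.0 for s in range(n)]
    for _ in range(horizon):
        values = [
            0.0 if s in unsafe else worst_case_expectation(P[s], values)
            for s in range(n)
        ]
    return values


# Hypothetical chain: state 0 is safe, state 1 is unsafe and absorbing.
# From state 0 the system stays safe with probability in [0.90, 0.95].
P_example = [
    [(0.90, 0.95), (0.05, 0.10)],
    [(0.0, 0.0), (1.0, 1.0)],
]
bounds = safety_lower_bound(P_example, unsafe={1}, horizon=2)
print(bounds[0])  # approximately 0.81, i.e. 0.9 * 0.9 in the worst case
```

In this toy setting, wider intervals would stand in for greater uncertainty about perception accuracy at test time, and the bound degrades accordingly rather than becoming invalid. An interval MDP adds a choice of action at each state on top of this computation.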