🤖 AI Summary
Verifying closed-loop vision-based control systems faces two challenges: the high dimensionality of images and the difficulty of modeling the visual environment. Existing camera surrogates built on stochastic generative models introduce latent variables, which lead to over-approximation error. This paper proposes a Deterministic World Model (DWM), described as the first fully deterministic state-to-image mapping architecture, which removes the conservatism induced by stochastic latent variables. Training adds a control difference loss to pixel-level reconstruction so that the controller behaves consistently on generated and real images in closed-loop execution. The authors further combine StarV reachability analysis with conformal prediction to derive rigorous statistical bounds on the trajectory deviation between the world model and the actual system. Experiments on standard benchmarks show that DWM yields significantly tighter reachable sets and better verification performance than a latent-variable baseline, with tighter and more reliable deviation bounds.
📝 Abstract
Verifying closed-loop vision-based control systems remains a fundamental challenge due to the high dimensionality of images and the difficulty of modeling visual environments. While generative models are increasingly used as camera surrogates in verification, their reliance on stochastic latent variables introduces unnecessary over-approximation error. To address this bottleneck, we propose a Deterministic World Model (DWM) that maps system states directly to generated images, eliminating uninterpretable latent variables so that input bounds can be specified precisely. The DWM is trained with a dual-objective loss function that combines pixel-level reconstruction accuracy with a control difference loss to maintain behavioral consistency with the real system. We integrate the DWM into a verification pipeline that uses Star-based reachability analysis (StarV), and we employ conformal prediction to derive rigorous statistical bounds on the trajectory deviation between the world model and the actual vision-based system. Experiments on standard benchmarks show that our approach yields significantly tighter reachable sets and better verification performance than a latent-variable baseline.
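The two quantitative ingredients named in the abstract, the dual-objective loss and the conformal deviation bound, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the exact form of either term, so the mean-squared-error terms, the `weight` parameter, the flattened-list image representation, and the function names `dual_objective_loss` and `conformal_bound` are hypothetical. The bound uses the standard split conformal quantile over scalar deviation scores; the paper's actual calibration procedure may differ.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def dual_objective_loss(
    real_image: Vector,
    generated_image: Vector,
    controller: Callable[[Vector], Vector],
    weight: float = 1.0,
) -> float:
    """Pixel reconstruction error plus a weighted control difference term.

    Assumed form (not taken from the paper): mean squared pixel error plus
    `weight` times the mean squared difference between the controller's
    outputs on the real and the generated image.
    """
    recon = sum((r - g) ** 2 for r, g in zip(real_image, generated_image))
    recon /= len(real_image)

    u_real = controller(real_image)
    u_gen = controller(generated_image)
    control = sum((a - b) ** 2 for a, b in zip(u_real, u_gen)) / len(u_real)

    return recon + weight * control


def conformal_bound(scores: Sequence[float], alpha: float) -> float:
    """Split conformal quantile of calibration scores.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score. For a fresh
    exchangeable score, the probability of exceeding this value is at most
    alpha. If n is too small for the requested alpha, the bound is infinite.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


if __name__ == "__main__":
    # Toy controller (hypothetical): a single control value equal to the pixel sum.
    toy_controller = lambda img: [sum(img)]
    print(dual_objective_loss([0.0, 1.0], [0.0, 0.5], toy_controller, weight=2.0))

    # Hypothetical per-trajectory deviations between world model and real system.
    deviations = [0.3, 0.1, 0.2, 0.7, 0.5, 0.4, 0.6]
    print(conformal_bound(deviations, alpha=0.25))
```

In this sketch each calibration score would be one scalar per trajectory, for example the maximum state deviation between the world-model rollout and the real rollout. The resulting quantile is the kind of statistical bound that could then be used to inflate the reachable sets computed on the world model.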