🤖 AI Summary
To address the limited range of Operational Design Domain (ODD) conditions available in autonomous driving simulators, this paper evaluates diffusion model-based augmentation of simulator-generated driving images. It compares three generative strategies (instruction-editing, inpainting, and inpainting with refinement) in terms of effectiveness and computational overhead, and applies them to system-level testing of Autonomous Driving Systems (ADS). A novel automated detector for invalid inputs, based on semantic segmentation, checks that the generated images preserve the semantics and realism of the original scene. The results show that diffusion models help increase ODD coverage, that the semantic validator reaches a percentage of false positives as low as 3%, and that the approach identifies new ADS system failures before real-world testing.
📝 Abstract
Simulation-based testing is widely used to assess the reliability of Autonomous Driving Systems (ADS), but its effectiveness is limited by the operational design domain (ODD) conditions available in such simulators. To address this limitation, we explore the integration of generative artificial intelligence techniques with physics-based simulators to enhance ADS system-level testing. Our study evaluates the effectiveness and computational overhead of three generative strategies based on diffusion models, namely instruction-editing, inpainting, and inpainting with refinement. Specifically, we assess the ability of these techniques to augment simulator-generated images of driving scenarios so that they represent new ODDs. We employ a novel automated detector for invalid inputs, based on semantic segmentation, to ensure semantic preservation and realism of the neural-generated images. We then perform system-level testing to evaluate the ADS's ability to generalize to the newly synthesized ODDs. Our findings show that diffusion models help increase the ODD coverage for system-level testing of ADS. Our automated semantic validator achieved a percentage of false positives as low as 3%, retaining the correctness and quality of the generated images for testing. Our approach successfully identified new ADS system failures before real-world testing.
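The abstract does not say how the segmentation-based detector of invalid inputs works internally. As a minimal sketch of one plausible check, the Python example below compares the segmentation label map of the original simulator image with that of the augmented image and rejects the augmented image when driving-relevant classes no longer overlap enough. The class ids, the per-class intersection-over-union (IoU) criterion, the 0.9 threshold, and the function names are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

# Hypothetical class ids; a real segmentation model defines its own label set.
ROAD, LANE_MARKING, SKY = 0, 1, 2


def class_iou(mask_a, mask_b, class_id):
    """Intersection over union of one class between two label maps."""
    a = mask_a == class_id
    b = mask_b == class_id
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # class absent from both maps: nothing to violate
    return float(np.logical_and(a, b).sum() / union)


def is_semantically_valid(orig_mask, aug_mask,
                          preserved=(ROAD, LANE_MARKING), threshold=0.9):
    """Accept the augmented image only if every preserved class keeps IoU >= threshold.

    The threshold is an assumed value chosen for this sketch.
    """
    if orig_mask.shape != aug_mask.shape:
        raise ValueError("label maps must have the same shape")
    return all(class_iou(orig_mask, aug_mask, c) >= threshold for c in preserved)


# Toy 4x4 label maps: the top two rows are sky, the bottom two rows are road.
original = np.array([[SKY] * 4, [SKY] * 4, [ROAD] * 4, [ROAD] * 4])

layout_intact = original.copy()      # augmentation left the scene layout unchanged
road_corrupted = original.copy()
road_corrupted[2, :] = SKY           # half of the road pixels were relabeled

print(is_semantically_valid(original, layout_intact))
print(is_semantically_valid(original, road_corrupted))
```

In this toy case the second label map keeps only half of the road pixels, so its road IoU falls below the assumed threshold and the image would be discarded before being passed to the ADS under test. Restricting the check to driving-relevant classes is a design choice of the sketch: it leaves the generative model free to change the sky, weather, or background that defines the new ODD.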