🤖 AI Summary
This work proposes SafePlanner, a novel approach to uncovering safety-critical flaws in autonomous driving planning models caused by scene transitions and control logic. SafePlanner is the first to integrate structural analysis of planning code with test scenario generation: it systematically identifies hazardous planning behaviors by extracting scene-transition logic, modeling NPC (non-player character) behaviors, and employing directed fuzz testing. Evaluated on Baidu Apollo, SafePlanner generated 20,635 test cases, revealing 520 dangerous behaviors categorized into 15 root-cause types, and enabled the repair of four such issues. The method achieves 83.63% function coverage and 63.22% decision coverage, significantly outperforming baseline approaches.
📝 Abstract
In this work, we present SafePlanner, a systematic testing framework for identifying safety-critical flaws in the Plan model of Automated Driving Systems (ADS). SafePlanner targets two core challenges: generating structurally meaningful test scenarios and detecting hazardous planning behaviors. To maximize coverage, SafePlanner performs a structural analysis of the Plan model implementation - specifically, its scene-transition logic and hierarchical control flow - and uses this insight to extract feasible scene transitions from the code. It then composes test scenarios by combining these transitions with non-player character (NPC) vehicle behaviors. Guided fuzzing is applied to explore the behavioral space of the Plan model under these scenarios. We evaluate SafePlanner on Baidu Apollo, a production-grade Level 4 ADS. It generates 20,635 test cases and detects 520 hazardous behaviors, grouped into 15 root causes through manual analysis. For four of these, we applied patches based on our analysis; the issues disappeared, and no apparent side effects were observed. SafePlanner achieves 83.63% function coverage and 63.22% decision coverage on the Plan model, outperforming baselines in both bug discovery and efficiency.
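The pipeline the abstract describes - extract feasible scene transitions from the planner's code, combine them with NPC behaviors into test scenarios, then explore those scenarios with coverage-guided fuzzing - can be sketched roughly as follows. This is a minimal illustrative sketch, not SafePlanner's or Apollo's actual interface; the transition graph, behavior list, and `run_scenario` callback are all hypothetical placeholders.

```python
import itertools
import random

# Hypothetical scene-transition graph, as might be extracted from the
# Plan model's code (Apollo's real scenario set differs; names are
# illustrative only).
SCENE_TRANSITIONS = {
    "LANE_FOLLOW": ["PULL_OVER", "STOP_SIGN"],
    "PULL_OVER": ["LANE_FOLLOW"],
    "STOP_SIGN": ["LANE_FOLLOW"],
}

# Hypothetical NPC vehicle behaviors to combine with transitions.
NPC_BEHAVIORS = ["cut_in", "hard_brake", "tailgate"]


def feasible_transitions(graph):
    """Enumerate every (src, dst) scene transition present in the graph."""
    return [(src, dst) for src, dsts in graph.items() for dst in dsts]


def compose_scenarios(graph, behaviors):
    """Pair each feasible transition with each NPC behavior."""
    return [
        {"transition": t, "npc": b}
        for t, b in itertools.product(feasible_transitions(graph), behaviors)
    ]


def fuzz(scenarios, run_scenario, budget=100, seed=0):
    """Coverage-guided loop: scenarios that reach new planner decisions
    are mutated and requeued; hazardous outcomes are recorded.

    `run_scenario` is a placeholder for executing the ADS under a
    scenario; it returns (decisions_covered, hazardous_flag).
    """
    rng = random.Random(seed)
    covered, hazards = set(), []
    queue = list(scenarios)
    for _ in range(budget):
        if not queue:
            break
        scenario = queue.pop(0)
        decisions, hazardous = run_scenario(scenario)
        if hazardous:
            hazards.append(scenario)
        if decisions - covered:  # new coverage -> mutate and requeue
            covered |= decisions
            mutated = dict(scenario, npc=rng.choice(NPC_BEHAVIORS))
            queue.append(mutated)
    return covered, hazards
```

Under this toy graph, `compose_scenarios` yields 4 transitions x 3 behaviors = 12 seed scenarios; the fuzz loop then prioritizes whichever of them exercise previously unseen planner decisions.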