🤖 AI Summary
This work addresses the challenge of testing high-dimensional, nondeterministic, and computationally expensive software systems, exemplified by the CARLA autonomous driving simulator, where unobserved variables and interactions between variables undermine causal inference. Existing causal testing methods can only test properties in which every variable involved is observed and no variables interact, which rarely holds in realistic, partially observable settings. To overcome this, the authors bring two further concepts from causal inference, effect modification and instrumental variable methods, into an existing causal testing tool, so that it can handle unrecorded variables and identify interaction effects. The approach does not depend on complete logging or on instrumenting the source code; in a case study it yields reliable test outcomes for three system-level requirements of CARLA without requiring large amounts of highly controlled test data. As a result, it reduces dependence on large-scale test data and strong observability assumptions, making causal testing more practical for complex cyber-physical systems.
📝 Abstract
Software systems with large parameter spaces, nondeterminism and high computational cost are challenging to test. Recently, software testing techniques based on causal inference have been successfully applied to systems that exhibit such characteristics, including scientific models and autonomous driving systems. One significant limitation is that these techniques are restricted to testing properties in which all of the variables involved can be observed and there are no interactions between variables. In practice, this is rarely guaranteed: the logging infrastructure may not be available to record all of the necessary runtime variable values, and an output of the system is often affected by complex interactions between variables. To address this, we leverage two additional concepts from causal inference, namely effect modification and instrumental variable methods. We build these concepts into an existing causal testing tool and conduct an evaluative case study that uses them to test three system-level requirements of CARLA, a high-fidelity driving simulator widely used in autonomous vehicle development and testing. The results show that we can obtain reliable test outcomes without requiring large amounts of highly controlled test data or instrumentation of the code, even when variables interact with each other and are not recorded in the test data.
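To make the two concepts named in the abstract concrete, the following is a minimal, self-contained sketch on simulated data. It is not the authors' tool and uses no CARLA data: every variable name, coefficient, and helper function here is a hypothetical chosen for illustration. The first part assumes a linear model with an unrecorded confounder `u` and a valid instrument `z`, and compares a naive regression slope with the standard instrumental variable ratio estimator. The second part assumes a binary effect modifier `m` and estimates the effect of `x` on `y` separately in each stratum.

```python
import random


def covariance(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope of the simple linear regression of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Instrumental variable ratio estimator: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


def simulate_hidden_confounder(n=20000, true_effect=2.0, seed=0):
    """Hypothetical data: u confounds x and y but is never recorded."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)            # unobserved confounder (not returned)
        zi = rng.gauss(0, 1)           # instrument: influences y only through x
        xi = zi + u + rng.gauss(0, 1)  # treatment depends on z and u
        yi = true_effect * xi + 3.0 * u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def simulate_effect_modifier(n=20000, seed=1):
    """Hypothetical data: the effect of x on y is 1.0 when m=0 and 3.0 when m=1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        m = rng.randint(0, 1)          # binary effect modifier
        x = rng.gauss(0, 1)
        y = (1.0 + 2.0 * m) * x + rng.gauss(0, 1)
        rows.append((m, x, y))
    return rows


def slope_by_stratum(rows):
    """Estimate the effect of x on y separately for each level of m."""
    slopes = {}
    for level in sorted({m for m, _, _ in rows}):
        xs = [x for m, x, _ in rows if m == level]
        ys = [y for m, _, y in rows if m == level]
        slopes[level] = ols_slope(xs, ys)
    return slopes


z, x, y = simulate_hidden_confounder()
naive = ols_slope(x, y)   # biased, because u is missing from the data
iv = iv_slope(z, x, y)    # uses only the recorded z, x and y
slopes = slope_by_stratum(simulate_effect_modifier())

print(f"naive slope: {naive:.2f}, IV slope: {iv:.2f}")
print({level: round(s, 2) for level, s in slopes.items()})
```

Under these simulated assumptions, the naive slope is pulled away from the true effect of 2.0 by the hidden confounder, while the ratio estimator stays close to it using only recorded variables. The per-stratum slopes differ by the level of `m`, which is the pattern an effect modification analysis is meant to surface. Real test data would also require checking that the chosen instrument is actually valid, which this sketch simply assumes.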