🤖 AI Summary
Existing fault localization methods rarely incorporate the domain and structural knowledge of test engineers and provide no probabilistic risk assessment of potential root causes, which leaves excessively large candidate root-cause sets and high diagnostic costs in complex systems. This paper proposes BayesFLo, a Bayesian fault localization framework whose prior on root-cause probabilities encodes the principles of combination hierarchy and heredity, capturing the expected structure of failure-inducing combinations while allowing expert knowledge to be encoded and uncertainty to be quantified. Posterior root-cause probabilities are computed efficiently using integer programming and graph representations, yielding a probabilistic ranking of root causes. In two case studies, the Traffic Alert and Collision Avoidance System (TCAS) and the JMP Easy DOE platform, the method outperforms existing approaches: it reduces the average candidate root-cause set size by over 60%, substantially lowering diagnostic effort and cost.
📝 Abstract
Software testing is essential for the reliable development of complex software systems. A key step in software testing is fault localization, which uses test data to pinpoint failure-inducing combinations for further diagnosis. Existing fault localization methods have two key limitations: they (i) do not incorporate domain and/or structural knowledge from test engineers, and (ii) do not provide a probabilistic assessment of risk for potential root causes. Such methods can thus fail to confidently whittle down the combinatorial number of potential root causes in complex systems, resulting in prohibitively high testing costs. To address this, we propose a novel Bayesian fault localization framework called BayesFLo, which leverages a flexible Bayesian model for identifying potential root causes with probabilistic uncertainty. Using a carefully specified prior on root cause probabilities, BayesFLo permits the integration of domain and structural knowledge via the principles of combination hierarchy and heredity, which capture the expected structure of failure-inducing combinations. We then develop new algorithms for efficient computation of posterior root cause probabilities, leveraging recent tools from integer programming and graph representations. Finally, we demonstrate the effectiveness of BayesFLo over existing methods in two fault localization case studies on the Traffic Alert and Collision Avoidance System and the JMP Easy DOE platform.
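To make the idea of posterior root cause probabilities concrete, the following is a minimal toy sketch and not the BayesFLo algorithm itself. It assumes, purely for illustration, that a test case fails exactly when it contains at least one root cause, that candidates are limited to single-factor and two-factor combinations, and that candidates are independent a priori. The function name and the prior values `prior_single` and `prior_pair` are hypothetical; giving pairs a smaller prior loosely mimics combination hierarchy, while heredity is not modeled. The posterior is obtained by brute-force enumeration, which only works for tiny examples and stands in for the integer programming and graph-based computation mentioned in the abstract.

```python
from itertools import combinations, product


def candidates(test):
    """All single-factor and two-factor level combinations present in a test case."""
    items = tuple(enumerate(test))  # (factor index, level) pairs
    singles = [frozenset([item]) for item in items]
    pairs = [frozenset(pair) for pair in combinations(items, 2)]
    return singles + pairs


def posterior_root_cause_probs(tests, outcomes, prior_single=0.2, prior_pair=0.05):
    """Toy posterior P(candidate is a root cause | test outcomes).

    tests    : list of tuples, one level per factor
    outcomes : list of bools, True if the corresponding test failed
    The prior values are illustrative assumptions, not taken from the paper.
    """
    passed = [t for t, failed in zip(tests, outcomes) if not failed]
    failed = [t for t, did_fail in zip(tests, outcomes) if did_fail]

    # Under the toy model, anything seen in a passing test cannot be a root cause.
    cleared = {c for t in passed for c in candidates(t)}
    suspects = sorted({c for t in failed for c in candidates(t)} - cleared, key=sorted)
    prior = {c: prior_single if len(c) == 1 else prior_pair for c in suspects}
    failed_sets = [set(candidates(t)) for t in failed]

    evidence = 0.0
    marginal = dict.fromkeys(suspects, 0.0)
    # Enumerate every on/off assignment of the suspects (exponential; toy sizes only).
    for flags in product([False, True], repeat=len(suspects)):
        active = {c for c, on in zip(suspects, flags) if on}
        # Keep only assignments that explain every failed test.
        if not all(active & fs for fs in failed_sets):
            continue
        weight = 1.0
        for c, on in zip(suspects, flags):
            weight *= prior[c] if on else 1.0 - prior[c]
        evidence += weight
        for c in active:
            marginal[c] += weight

    if evidence == 0.0:
        raise ValueError("no one- or two-factor combination explains the failures")
    return {c: m / evidence for c, m in marginal.items()}


if __name__ == "__main__":
    # Two binary factors: the test (0, 0) passes and the test (1, 1) fails.
    probs = posterior_root_cause_probs([(0, 0), (1, 1)], [False, True])
    for combo, p in probs.items():
        print(sorted(combo), round(p, 4))
```

In this two-factor example, each single-factor suspect ends up with a posterior of about 0.51 and the two-factor suspect with about 0.13, so the failure is attributed mainly to the simpler explanations. Different prior choices would change this ranking, which is the point at which domain knowledge can enter.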