🤖 AI Summary
Problem: Semantic modeling in test-based fault localization faces a fundamental trade-off between precision and scalability. Method: This paper proposes SmartFL, a lightweight "value-correctness-oriented" semantic modeling approach that captures only whether program values are correct, rather than the program's full semantics, to balance localization accuracy against efficiency. It combines probabilistic graphical models, dynamic instrumentation, lightweight semantic abstraction, and efficient likelihood estimation, and it can be combined with existing techniques through the CombineFL framework. Results: On Defects4J 2.0, the approach achieves a top-1 statement-level accuracy of 14%, a 130% improvement over the best SBFL and MBFL methods; the average analysis time is 205 seconds per fault, about half that of SBFL methods. Combining it with existing approaches via CombineFL further improves top-1, top-3, and top-5 accuracy by an average of 10% over state-of-the-art combination methods. The authors position the approach as a balance among semantic granularity, accuracy, and efficiency.
📝 Abstract
Testing-based fault localization has been a research focus in software engineering over the past decades. It localizes faulty program elements based on a set of passing and failing test executions. Because whether a test triggers and detects a fault depends on program semantics, it is crucial to model program semantics in fault localization approaches. Existing approaches either consider the full semantics of the program (e.g., mutation-based fault localization and angelic debugging), which leads to scalability issues, or ignore the semantics of the program (e.g., spectrum-based fault localization), which leads to imprecise localization results. Our key idea is that by modeling only the correctness of program values, not their full semantics, a balance can be reached between effectiveness and scalability. To realize this idea, we introduce a probabilistic model that efficiently approximates program semantics, along with several techniques to address scalability challenges. Our approach, SmartFL (SeMantics bAsed pRobabilisTic Fault Localization), is evaluated on a real-world dataset, Defects4J 2.0. The top-1 statement-level accuracy of our approach is 14%, a 130% improvement over the best SBFL and MBFL methods. The average time cost is 205 seconds per fault, about half that of SBFL methods. After combining our approach with existing approaches using the CombineFL framework, the combined approach improves on state-of-the-art combination methods by an average of 10% in top-1, top-3, and top-5 accuracy.
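
To make the key idea concrete, the following is a minimal toy sketch of modeling only the correctness of values, and it is not SmartFL's actual model. The three-statement program, the names `STATEMENTS`, `PRIOR_FAULTY`, `P_LUCKY`, and `posterior_faulty`, and all probabilities are assumptions chosen for illustration. Each statement has a Boolean "is faulty" variable and each value a Boolean "is correct" variable; the posterior fault probability is computed by brute-force enumeration, which only works at this toy scale.

```python
from itertools import product

# Hypothetical three-statement program (not from the paper):
#   s1: a = x + 1
#   s2: b = a * 2
#   s3: c = x - 1
# Each entry maps a statement to (value it defines, values it reads).
STATEMENTS = {
    "s1": ("a", []),
    "s2": ("b", ["a"]),
    "s3": ("c", []),
}

PRIOR_FAULTY = 0.1  # assumed prior probability that a statement is faulty
P_LUCKY = 0.05      # assumed chance a value is correct despite a fault or a wrong input


def p_value_correct(stmt_faulty, inputs_correct):
    """Probability that a statement's output value is correct."""
    if not stmt_faulty and inputs_correct:
        return 1.0
    return P_LUCKY


def posterior_faulty(observed):
    """Posterior P(statement faulty | observed value correctness), by enumeration.

    `observed` maps a value name to True (test saw a correct value) or False.
    """
    names = list(STATEMENTS)
    values = [STATEMENTS[s][0] for s in names]
    weight_faulty = {s: 0.0 for s in names}
    total = 0.0
    for faults in product([False, True], repeat=len(names)):
        for correct in product([False, True], repeat=len(values)):
            fault_of = dict(zip(names, faults))
            ok = dict(zip(values, correct))
            # Skip assignments that contradict the test observations.
            if any(ok[v] != obs for v, obs in observed.items()):
                continue
            w = 1.0
            for s in names:
                w *= PRIOR_FAULTY if fault_of[s] else 1 - PRIOR_FAULTY
                out, ins = STATEMENTS[s]
                p = p_value_correct(fault_of[s], all(ok[i] for i in ins))
                w *= p if ok[out] else 1 - p
            total += w
            for s in names:
                if fault_of[s]:
                    weight_faulty[s] += w
    return {s: weight_faulty[s] / total for s in names}


if __name__ == "__main__":
    # A failing test observed a wrong `b`; a passing test observed a correct `c`.
    ranking = posterior_faulty({"b": False, "c": True})
    for stmt, prob in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(f"{stmt}: {prob:.3f}")
```

In this toy setting, the statements on the data-flow path to the wrong value (`s1` and `s2`) receive much higher posterior fault probability than `s3`, whose output was observed to be correct. A real system would need an inference method that scales far beyond exhaustive enumeration.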