SmartFL: Semantics Based Probabilistic Fault Localization

📅 2025-03-29
🤖 AI Summary
Semantic modeling in test-based fault localization faces a fundamental trade-off between precision and scalability. Method: This paper proposes a lightweight, value-correctness-oriented semantic model that captures only whether program values are correct, rather than the full program semantics, so that localization stays accurate without becoming prohibitively expensive. It combines probabilistic graphical models, dynamic instrumentation, lightweight semantic abstraction, and efficient likelihood estimation. Results: Evaluated on Defects4J 2.0, the approach achieves a top-1 statement-level accuracy of 14%, a 130% improvement over the best SBFL and MBFL methods; the average time cost is 205 seconds per fault, half that of SBFL methods. Combining the approach with existing ones through the CombineFL framework improves top-1, top-3, and top-5 accuracy by 10% on average over state-of-the-art combination methods. The paper presents this as a balance between effectiveness and scalability.

📝 Abstract
Testing-based fault localization has been a research focus in software engineering over the past decades. It localizes faulty program elements based on a set of passing and failing test executions. Since whether a fault can be triggered and detected by a test depends on program semantics, it is crucial to model program semantics in fault localization approaches. Existing approaches either consider the full semantics of the program (e.g., mutation-based fault localization and angelic debugging), leading to scalability issues, or ignore the semantics of the program (e.g., spectrum-based fault localization), leading to imprecise localization results. Our key idea is that by modeling only the correctness of program values, and not their full semantics, a balance can be reached between effectiveness and scalability. To realize this idea, we introduce a probabilistic model based on an efficient approximation of program semantics, along with several techniques that address scalability challenges. Our approach, SmartFL (SeMantics bAsed pRobabilisTic Fault Localization), is evaluated on a real-world dataset, Defects4J 2.0. The top-1 statement-level accuracy of our approach is 14%, a 130% improvement over the best SBFL and MBFL methods. The average time cost is 205 seconds per fault, which is half that of SBFL methods. After combining our approach with existing approaches using the CombineFL framework, the performance of the combined approach improves by an average of 10% on top-1, top-3, and top-5 accuracy compared to state-of-the-art combination methods.
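
To make the "ignores the semantics of the program" point concrete, here is a minimal sketch of spectrum-based fault localization using the widely used Ochiai formula. The abstract does not say which SBFL formulas were used as baselines, so the choice of Ochiai, the function name `ochiai_scores`, and the toy coverage data are all assumptions made for illustration.

```python
import math

def ochiai_scores(coverage, failed):
    """Score statements with the Ochiai formula.

    coverage: dict mapping test name -> set of covered statement ids
    failed:   set of failing test names
    """
    total_failed = len(failed)
    statements = set().union(*coverage.values())
    scores = {}
    for s in statements:
        # ef / ep: number of failing / passing tests that execute s
        ef = sum(1 for t, cov in coverage.items() if s in cov and t in failed)
        ep = sum(1 for t, cov in coverage.items() if s in cov and t not in failed)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[s] = ef / denom if denom > 0 else 0.0
    return scores

# Hypothetical coverage data: t2 is the only failing test.
coverage = {
    "t1": {1, 2, 3},
    "t2": {1, 2, 4, 5},
    "t3": {1, 3},
}
failed = {"t2"}

scores = ochiai_scores(coverage, failed)
# Most suspicious first; ties are broken by statement id.
ranking = sorted(scores, key=lambda s: (-scores[s], s))
print(ranking)
```

Statements 4 and 5 are covered by exactly the same tests, so they receive identical scores regardless of what they compute. Coverage alone cannot separate them, which is the kind of imprecision the abstract attributes to approaches that ignore semantics.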
Problem

Research questions and friction points this paper is trying to address.

Balancing effectiveness and scalability in fault localization
Modeling program semantics for precise fault detection
Improving accuracy and speed over existing localization methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Models the correctness of program values, not their full semantics
Uses a probabilistic model with an efficient approximation of program semantics
Balances effectiveness and scalability in fault localization
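
The idea of modeling only value correctness can be illustrated with a toy Bayesian model. This is a sketch under stated assumptions, not SmartFL's actual model: the prior `P_FAULT`, the two masking probabilities, the exact test oracle, and the brute-force enumeration are all hypothetical choices made to keep the example small.

```python
from itertools import product

P_FAULT = 0.1         # assumed prior that a statement is faulty
P_OK_IF_FAULTY = 0.5  # assumed chance a faulty statement still yields a correct value
P_OK_IF_BAD_IN = 0.1  # assumed chance a correct statement masks an incorrect input

def p_value_correct(stmt_faulty, inputs_correct):
    """Probability that a defined value is correct (only correctness is modeled)."""
    if stmt_faulty:
        return P_OK_IF_FAULTY
    return 1.0 if inputs_correct else P_OK_IF_BAD_IN

def trace_likelihood(trace, output, passed, faulty):
    """P(test outcome | fault assignment), summing over hidden value correctness.

    trace: list of (statement, defined_variable, used_variables); used variables
    must be defined earlier in the same trace (test inputs are assumed correct).
    """
    variables = [var for _, var, _ in trace]
    total = 0.0
    for bits in product([True, False], repeat=len(variables)):
        correct = dict(zip(variables, bits))
        if correct[output] != passed:  # assumed oracle: pass iff output is correct
            continue
        p = 1.0
        for stmt, var, uses in trace:
            inputs_ok = all(correct[u] for u in uses)
            pc = p_value_correct(faulty[stmt], inputs_ok)
            p *= pc if correct[var] else 1.0 - pc
        total += p
    return total

def posterior_faulty(statements, tests):
    """Posterior P(statement is faulty | test outcomes) by exhaustive enumeration."""
    weights = {s: 0.0 for s in statements}
    evidence = 0.0
    for bits in product([True, False], repeat=len(statements)):
        faulty = dict(zip(statements, bits))
        p = 1.0
        for s in statements:
            p *= P_FAULT if faulty[s] else 1.0 - P_FAULT
        for trace, output, passed in tests:
            p *= trace_likelihood(trace, output, passed, faulty)
        evidence += p
        for s in statements:
            if faulty[s]:
                weights[s] += p
    return {s: w / evidence for s, w in weights.items()}

# Hypothetical program: s1 defines a; s2 computes b from a; s3 computes c from a.
tests = [
    ([("s1", "a", []), ("s2", "b", ["a"])], "b", False),  # failing test observes b
    ([("s1", "a", []), ("s3", "c", ["a"])], "c", True),   # passing test observes c
]
posterior = posterior_faulty(["s1", "s2", "s3"], tests)
print(sorted(posterior, key=posterior.get, reverse=True))
```

In this toy setting `s2` ranks above `s1`, which ranks above `s3`: the value produced by `s1` also flows into a passing output, which lowers its suspicion, while `s3` is only ever involved in a passing test. Exhaustive enumeration is exponential in the number of statements and values, which is why a practical approach needs the kind of efficient approximation the paper describes.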
Yiqian Wu
Ph.D. candidate, Zhejiang University
Digital humans
Yujie Liu
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, Beijing, China.
Yi Yin
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, Beijing, China.
Muhan Zeng
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, Beijing, China.
Zhentao Ye
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, Beijing, China.
Xin Zhang
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, Beijing, China.
Yingfei Xiong
Associate Professor, Peking University
Software Engineering, Programming Languages, Program Repair, Program Synthesis, Program Analysis
Lu Zhang
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, Beijing, China.