Automatically Learning a Precise Measurement for Fault Diagnosis Capability of Test Cases

📅 2025-01-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Fault Diagnosis Capability (FDC) assessment methods rely heavily on test execution outcomes and suffer from heuristic design limitations, poor generalizability, and low diagnostic accuracy. To address these issues in software fault localization (FL), this paper introduces, for the first time, a reinforcement learning (RL)-based approach to FDC modeling, proposing the execution-result-agnostic metric RLFDC. RLFDC constructs a reward mechanism grounded in FL feedback signals to automatically learn the contribution of each test case to fault localization, thereby eliminating reliance on manual heuristics. The method jointly models test coverage features and execution behaviors to enable end-to-end FDC prediction. Extensive experiments on the Defects4J benchmark demonstrate that RLFDC consistently outperforms existing result-agnostic metrics in both test selection and test generation tasks, significantly improving FL accuracy and efficiency. This work establishes a novel paradigm for intelligent test optimization.

📝 Abstract
Prevalent Fault Localization (FL) techniques rely on tests to localize buggy program elements. Tests could be treated as fuel to further boost FL by providing more debugging information. Therefore, it is highly valuable to measure the Fault Diagnosis Capability (FDC) of a test for diagnosing faults, so as to select or generate tests to better help FL. To this end, researchers have proposed many FDC metrics, which serve as the selection criterion in FL-oriented test selection or the fitness function in FL-oriented test generation. Existing FDC metrics can be classified into result-agnostic and result-aware metrics depending on whether they take test results (i.e., passing or failing) as input. Although result-aware metrics perform better in test selection, they have restricted applications due to the input of test results, e.g., they cannot be applied to guide test generation. Moreover, all the existing FDC metrics are designed based on some predefined heuristics and have achieved limited FL performance due to their inaccuracy. To address these issues, in this paper, we reconsider result-agnostic metrics, and propose a novel result-agnostic metric RLFDC which predicts FDC values of tests through reinforcement learning. In particular, we treat FL results as reward signals, and train an FDC prediction model with the direct FL feedback to automatically learn a more accurate measurement rather than design one based on predefined heuristics. Finally, we evaluate the proposed RLFDC on Defects4J by applying the studied metrics to test selection and generation. According to the experimental results, the proposed RLFDC outperforms all the result-agnostic metrics in both test selection and generation.
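The core idea, using fault-localization feedback as a reward to learn a test-scoring policy from coverage features alone, can be sketched as a toy REINFORCE loop. Everything below (element count, reward proxy, linear model) is a hypothetical illustration, not the paper's actual architecture:

```python
import math
import random

random.seed(0)

# Toy sketch: learn a result-agnostic test-scoring model from coverage
# features, using FL feedback as the RL reward. All names, sizes, and the
# reward proxy are hypothetical assumptions for illustration.

N_ELEMS = 5   # number of program elements
FAULTY = 2    # index of the (simulated) faulty element

def make_test():
    """Random coverage vector; tests covering the fault tend to fail."""
    cov = [random.random() < 0.4 for _ in range(N_ELEMS)]
    fails = cov[FAULTY] and random.random() < 0.8
    return cov, fails

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

w = [0.0] * N_ELEMS  # linear FDC model over coverage features
LR = 0.5

for _ in range(300):
    pool = [make_test() for _ in range(8)]
    # Policy: pick a test with probability proportional to its FDC score.
    probs = softmax([sum(wi * ci for wi, ci in zip(w, cov))
                     for cov, _ in pool])
    k = random.choices(range(len(pool)), weights=probs)[0]
    cov, fails = pool[k]
    # Reward proxy for FL feedback: a failing test adds debugging signal.
    reward = 1.0 if fails else 0.0
    # REINFORCE update: grad log pi(k) = features(k) - expected features.
    for j in range(N_ELEMS):
        exp_feat = sum(p * c[j] for p, (c, _) in zip(probs, pool))
        w[j] += LR * reward * (cov[j] - exp_feat)

# After training, the weight on the faulty element grows positive: the
# model has learned, from coverage alone, to favor tests that aid FL.
```

Because the reward is computed only during training, the learned scorer itself never needs test results at prediction time, which is what makes a result-agnostic metric usable as a fitness function in test generation.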
Problem

Research questions and friction points this paper is trying to address.

Fault Diagnosis Capability
Testing Effectiveness
Limitations in Test Selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning
Fault Diagnosis Capability
Self-Optimization
Yifan Zhao
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, China
Zeyu Sun
National Key Laboratory of Space Integrated Information System, Institute of Software, Chinese Academy of Sciences, China
Guoqing Wang
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, China
Qingyuan Liang
Peking University
Software Engineering, Code Generation
Yakun Zhang
Harbin Institute of Technology, Shenzhen
Software Engineering, Program Analysis, GUI Agent, Large Language Model
Yiling Lou
Fudan University, China
Software Engineering, Testing, Debugging
Dan Hao
Peking University
Software Testing, Software Engineering, Debugging, Compiler Testing
L
Lu Zhang
Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education; School of Computer Science, Peking University, China