🤖 AI Summary
Existing Fault Diagnosis Capability (FDC) metrics are built on predefined heuristics, which limits their accuracy, and the better-performing result-aware metrics require test outcomes (pass or fail) as input, so they cannot guide test generation. To address these issues in software fault localization (FL), this paper proposes RLFDC, a result-agnostic FDC metric learned through reinforcement learning (RL). RLFDC uses FL results as reward signals to automatically learn how much each test contributes to fault localization, removing the reliance on manual heuristics. The method jointly models test coverage features and execution behaviors to enable end-to-end FDC prediction. Experiments on the Defects4J benchmark show that RLFDC outperforms the existing result-agnostic metrics in both test selection and test generation.
📝 Abstract
Prevalent Fault Localization (FL) techniques rely on tests to localize buggy program elements. Tests can be treated as fuel that further boosts FL by providing more debugging information. It is therefore highly valuable to measure the Fault Diagnosis Capability (FDC) of a test, so that tests can be selected or generated to better support FL. To this end, researchers have proposed many FDC metrics, which serve as the selection criterion in FL-oriented test selection or as the fitness function in FL-oriented test generation. Existing FDC metrics can be classified as result-agnostic or result-aware, depending on whether they take test results (i.e., passing or failing) as input. Although result-aware metrics perform better in test selection, their need for test results restricts their application; for example, they cannot be used to guide test generation. Moreover, all existing FDC metrics are designed around predefined heuristics, and their inaccuracy limits the FL performance they achieve. To address these issues, in this paper we reconsider result-agnostic metrics and propose a novel result-agnostic metric, RLFDC, which predicts the FDC values of tests through reinforcement learning. In particular, we treat FL results as reward signals and train an FDC prediction model on this direct FL feedback, so that it automatically learns a more accurate measurement rather than relying on one designed from predefined heuristics. Finally, we evaluate RLFDC on Defects4J by applying the studied metrics to test selection and test generation. According to the experimental results, RLFDC outperforms all the result-agnostic metrics in both test selection and generation.
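To make the idea of "FL results as reward signals" concrete, the following is a minimal sketch and not the paper's implementation. It assumes a spectrum-based FL technique (Ochiai is used here purely for illustration), a known faulty element during training, and a reward defined as the improvement in that element's rank after one more test is added. The names `ochiai`, `worst_case_rank`, and `fl_reward` are hypothetical, and the abstract does not specify the actual reward definition or FL technique.

```python
import math


def ochiai(coverage, results):
    """Ochiai suspiciousness per program element.

    coverage: list of sets, the elements covered by each test.
    results:  list of bools, True if the corresponding test failed.
    """
    total_failed = sum(results)
    elements = set().union(*coverage) if coverage else set()
    scores = {}
    for e in elements:
        ef = sum(1 for cov, failed in zip(coverage, results) if failed and e in cov)
        ep = sum(1 for cov, failed in zip(coverage, results) if not failed and e in cov)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[e] = ef / denom if denom > 0 else 0.0
    return scores


def worst_case_rank(scores, faulty):
    """Rank of the faulty element (1 is best); ties are counted pessimistically."""
    target = scores.get(faulty, 0.0)
    return sum(1 for s in scores.values() if s >= target)


def fl_reward(coverage, results, new_cov, new_failed, faulty):
    """Assumed reward: how many positions the faulty element's rank improves
    when the candidate test is added to the suite."""
    before = worst_case_rank(ochiai(coverage, results), faulty)
    after = worst_case_rank(
        ochiai(coverage + [new_cov], results + [new_failed]), faulty
    )
    return before - after


# One failing test covers a, b, c, so all three are tied; "b" is the fault.
suite_cov = [{"a", "b", "c"}]
suite_res = [True]

# A passing test that avoids "b" exonerates "a" and "c": positive reward.
helpful = fl_reward(suite_cov, suite_res, {"a", "c"}, False, "b")
# A passing test that covers only "b" lowers its suspiciousness: zero reward here.
unhelpful = fl_reward(suite_cov, suite_res, {"b"}, False, "b")
```

In this sketch the reward needs test results and the fault location, which are available when training on a benchmark of known bugs. A result-agnostic metric would then predict a test's value from inputs that do not include its pass/fail outcome, which is what allows it to serve as a fitness function during test generation.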