🤖 AI Summary
Existing Multi-Criteria Test Suite Minimization (MCTSM) techniques either lose much of their fault detection ability when removing redundant test cases or scale poorly because the problem is NP-hard. This paper proposes TripRL, which combines statement coverage, fault detection ability, and test-case coverage similarity in a single Integer Linear Programming (ILP) formulation. A bipartite graph representation and its embedding keep the ILP concise, and the ILP is solved together with Proximal Policy Optimization (PPO)-based reinforcement learning. In the evaluation, runtime scales linearly with problem size, and for large Defects4J test suites, where existing approaches do not return a solution in reasonable time, TripRL returns one in under 47 minutes. The reduced suites preserve the original statement coverage and known-fault detection ability, are more diverse, and have a higher potential to detect unknown faults.
📝 Abstract
The Multi-Criteria Test Suite Minimization (MCTSM) problem aims to remove redundant test cases, guided by adequacy criteria such as code coverage or fault detection capability. However, current techniques either exhibit a high loss of fault detection ability or face scalability challenges due to the NP-hard nature of the problem, which limits their practical utility. We propose TripRL, a novel technique that integrates traditional criteria, such as statement coverage and fault detection ability, with test coverage similarity into an Integer Linear Program (ILP) to produce a diverse reduced test suite with high test effectiveness. TripRL uses a bipartite graph representation and its embedding to formulate the ILP concisely, and combines the ILP with reinforcement learning (RL) training. This combination makes large-scale test suite minimization more scalable and enhances test effectiveness. Our empirical evaluations demonstrate that TripRL's runtime scales linearly with the magnitude of the MCTSM problem. Notably, for large test suites from the Defects4J dataset, where existing approaches fail to provide solutions within a reasonable time frame, our technique consistently delivers solutions in less than 47 minutes. The reduced test suites produced by TripRL also maintain the original statement coverage and fault detection ability while having a higher potential to detect unknown faults.
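To make the three criteria concrete, the following is a minimal Python sketch of the underlying optimization problem on a toy instance. It is not TripRL's ILP formulation or its RL solver: it enumerates subsets exhaustively, which is only feasible for a handful of tests, and the names `jaccard`, `total_similarity`, `minimize`, and the toy `coverage` and `faults` data are all hypothetical. The sketch picks the smallest suite that keeps the original statement coverage and known-fault detection, and breaks ties by choosing the suite whose tests have the least pairwise coverage similarity (Jaccard similarity is an assumed measure here).

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of covered statements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def total_similarity(suite, coverage):
    """Sum of pairwise coverage similarities within a candidate suite."""
    return sum(jaccard(coverage[s], coverage[t]) for s, t in combinations(suite, 2))


def minimize(coverage, faults):
    """Exhaustive toy minimizer (exponential time; illustration only).

    Returns the smallest suite that preserves all covered statements and all
    detected faults, preferring the least similar (most diverse) suite.
    """
    tests = sorted(coverage)
    all_stmts = set().union(*coverage.values())
    all_faults = set().union(*faults.values())
    for size in range(1, len(tests) + 1):
        feasible = []
        for suite in combinations(tests, size):
            stmts = set().union(*(coverage[t] for t in suite))
            found = set().union(*(faults.get(t, set()) for t in suite))
            if stmts == all_stmts and found == all_faults:
                feasible.append(suite)
        if feasible:
            # Tie-break on diversity, then on names for determinism.
            return min(feasible, key=lambda s: (total_similarity(s, coverage), s))
    return tuple(tests)


# Hypothetical toy data: test name -> covered statement ids / detected faults.
coverage = {
    "t1": {1, 2, 3},
    "t2": {3, 4},
    "t3": {1, 2},
    "t4": {4, 5},
    "t5": {1, 2, 3, 4},
}
faults = {"t1": {"f1"}, "t2": set(), "t3": {"f1"}, "t4": {"f2"}, "t5": {"f1"}}

reduced = minimize(coverage, faults)
print(reduced)
```

In this toy instance both `("t1", "t4")` and `("t4", "t5")` preserve coverage and fault detection with two tests, and the similarity tie-break selects the pair with no overlapping statements. The exhaustive search is exactly what does not scale, which is the gap the abstract says the ILP and RL combination is meant to close.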