AI Summary
Existing fine-grained predictive mutation testing relies on deep learning, which incurs high computational overhead and supports only intra-method mutants. This paper proposes WITNESS, a lightweight, interpretable, and holistic predictive approach. Methodologically, WITNESS is the first to jointly model both intra- and inter-method mutants, using a novel lightweight feature set that integrates semantic and structural information from before and after mutation; it replaces deep models with efficient classical machine learning to build a mutant-killing matrix predictor. Evaluated on Defects4J, WITNESS substantially reduces computational cost while achieving state-of-the-art prediction accuracy. Moreover, its test-case prioritization performance closely approximates that of the ground-truth killing matrix and outperforms all baseline methods.
Abstract
Existing fine-grained predictive mutation testing studies predominantly rely on deep learning, which faces two critical limitations in practice: (1) Exorbitant computational costs. The deep learning models adopted in these studies demand significant computational resources to accelerate training and inference. This introduces high costs and undermines the cost-reduction goal of predictive mutation testing. (2) Constrained applicability. Although modern mutation testing tools generate mutants both inside and outside methods, current fine-grained predictive mutation testing approaches handle only inside-method mutants. As a result, they cannot predict outside-method mutants, limiting their applicability in real-world scenarios. We propose WITNESS, a new fine-grained predictive mutation testing approach. WITNESS adopts a twofold design: (1) By collecting features from both inside-method and outside-method mutants, WITNESS applies to all generated mutants. (2) Instead of using computationally expensive deep learning, WITNESS employs lightweight classical machine learning models for training and prediction. This makes it more cost-effective and enables straightforward explanations of the decision-making processes behind the adopted models. Evaluations on Defects4J projects show that WITNESS consistently achieves state-of-the-art predictive performance across different scenarios. Additionally, WITNESS significantly improves the efficiency of kill matrix prediction. Post-hoc analysis reveals that features incorporating information from both before and after the mutation are the most important among those used in WITNESS. Test case prioritization based on the predicted kill matrix shows that WITNESS delivers results much closer to those obtained with the actual kill matrix than baseline approaches do.
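To make the last point concrete, here is a minimal sketch of how a kill matrix (predicted or actual) can drive test case prioritization. The abstract does not say which prioritization strategy the paper uses, so the greedy "most additional kills first" ordering below is an assumption, and the function name `prioritize_tests`, the test names, and the mutant ids are all hypothetical.

```python
from typing import Dict, List, Set


def prioritize_tests(kill_matrix: Dict[str, Set[str]]) -> List[str]:
    """Order tests greedily by how many not-yet-killed mutants each one kills.

    kill_matrix maps a test name to the set of mutant ids it is predicted
    (or known) to kill. Ties are broken by test name for determinism.
    This is an illustrative strategy, not necessarily the paper's.
    """
    remaining = {test: set(mutants) for test, mutants in kill_matrix.items()}
    killed: Set[str] = set()
    order: List[str] = []
    while remaining:
        # Pick the test with the most additional kills; the name breaks ties.
        best = min(remaining, key=lambda t: (-len(remaining[t] - killed), t))
        order.append(best)
        killed |= remaining.pop(best)
    return order


# Hypothetical predicted kill matrix: test name -> mutants it kills.
predicted = {
    "testA": {"m1", "m2"},
    "testB": {"m2", "m3", "m4"},
    "testC": {"m1"},
    "testD": set(),
}
ranking = prioritize_tests(predicted)
print(ranking)
```

Running the same function on the predicted matrix and on the actual matrix, then comparing the two orderings, is one simple way to see how closely a predictor approximates the ground truth for this downstream task.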