New Formulation of DNN Statistical Mutation Killing for Ensuring Monotonicity: A Technical Report

📅 2025-07-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing statistical mutant-killing criteria, such as that of DeepCrime, violate monotonicity: expanding the test set may overturn a prior “killed” judgment to “survived”, undermining the reliability and interpretability of DNN mutation testing. Method: This paper reformulates statistical mutant killing around Fisher’s exact test to establish a monotonic criterion. It compares the behavior of the original and mutated models and controls the Type-I error rate. Contribution/Results: The proposed method guarantees that once a mutant is declared “killed”, it remains so under any superset of the test data, resolving the non-monotonicity issue while preserving statistical rigor. Empirical evaluation on CIFAR-10 and ImageNet demonstrates significantly improved stability and reproducibility of mutation detection. The approach offers a theoretically sound and practically viable basis for assessing the effectiveness of DNN testing.

📝 Abstract
Mutation testing has emerged as a powerful technique for evaluating the effectiveness of test suites for Deep Neural Networks. Among existing approaches, the statistical mutant killing criterion of DeepCrime leverages statistical testing to determine whether a mutant differs significantly from the original model. However, it suffers from a critical limitation: it violates the monotonicity property, meaning that expanding a test set may result in previously killed mutants no longer being classified as killed. In this technical report, we propose a new formulation of statistical mutant killing based on Fisher's exact test that preserves the statistical rigour of the original criterion while ensuring monotonicity.
Problem

Research questions and friction points this paper is trying to address.

DeepCrime's statistical mutant killing criterion is not monotonic
Expanding a test set can turn a previously killed mutant into a surviving one
A killing criterion is needed that classifies mutants consistently as tests are added
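
The monotonicity property at stake can be written compactly. The notation here is introduced only for illustration and is not taken from the report: $M$ is a mutant, $T$ and $T'$ are test sets, and $\mathrm{killed}(M, T)$ is the verdict of the killing criterion.

```latex
\[
  \forall\, T \subseteq T' :\quad
  \mathrm{killed}(M, T) \;\Longrightarrow\; \mathrm{killed}(M, T')
\]
```

A criterion violates this property whenever some enlargement $T'$ of $T$ flips a "killed" verdict back to "survived".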
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Fisher exact test for statistical rigor
Ensures monotonicity in mutation testing
Improves DeepCrime's mutant killing criterion
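
The following is a minimal sketch of how a Fisher-based killing criterion can be made monotonic; it is an illustration under stated assumptions, not the report's actual formulation. It assumes that the original model and the mutant are each trained `n_instances` times, that for every test input we count how many instances classify it correctly, and that a mutant counts as killed if at least one input shows a significant difference. The function names, the per-input aggregation, and the default `alpha` are all hypothetical. The two-sided test is implemented with the standard library only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def is_killed(per_input_counts, n_instances, alpha=0.05):
    """Hypothetical criterion: killed if ANY input differs significantly.

    per_input_counts: list of (orig_correct, mutant_correct) pairs, one per
    test input, each counting correct predictions over n_instances models.
    """
    for orig_ok, mut_ok in per_input_counts:
        p = fisher_exact_two_sided(orig_ok, n_instances - orig_ok,
                                   mut_ok, n_instances - mut_ok)
        if p < alpha:
            return True
    return False


if __name__ == "__main__":
    small = [(20, 5)]                       # one input that separates the models
    large = small + [(20, 19), (18, 18)]    # superset with uninformative inputs
    print(is_killed(small, 20), is_killed(large, 20))
```

Because the verdict is an existential check over inputs, adding inputs can never turn a "killed" verdict into "survived" in this sketch. Running many per-input tests raises a multiple-comparison question that the sketch ignores; how the significance level should be handled across inputs is left open here.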