XMutant: XAI-based Fuzzing for Deep Learning Systems

📅 2025-03-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the inefficiency of random perturbations in semantic-based test generation for deep learning systems, this paper proposes XMutant, an explainable AI (XAI)-guided fuzz testing technique. XMutant uses the local explanation of each input to steer perturbations toward the parts of the input that most influence the prediction, and therefore toward failures of the system under test. Evaluated at the model level (sentiment analysis, digit recognition) and the system level (advanced driving assistance), it generates up to 125% more failure-inducing inputs than an existing baseline, up to 7× faster, while keeping the validity rate of the generated inputs above 89% according to automated and human validators.

📝 Abstract

Semantic-based test generators are widely used to produce failure-inducing inputs for Deep Learning (DL) systems. They typically generate challenging test inputs by applying random perturbations to input semantic concepts until a failure is found or a timeout is reached. However, such randomness may hinder them from efficiently achieving their goal. This paper proposes XMutant, a technique that leverages explainable artificial intelligence (XAI) techniques to generate challenging test inputs. XMutant uses the local explanation of the input to inform the fuzz testing process and effectively guide it toward failures of the DL system under test. We evaluated different configurations of XMutant in triggering failures for different DL systems both for model-level (sentiment analysis, digit recognition) and system-level testing (advanced driving assistance). Our studies showed that XMutant enables more effective and efficient test generation by focusing on the most impactful parts of the input. XMutant generates up to 125% more failure-inducing inputs compared to an existing baseline, up to 7X faster. We also assessed the validity of these inputs, maintaining a validation rate above 89%, according to automated and human validators.
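The idea of letting a local explanation steer the fuzzer can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy logistic model (`WEIGHTS`, `BIAS`), the occlusion-based attribution, the fixed step size, and the function names. It only shows the general loop of explaining the current input, mutating the most influential feature, and checking for a changed prediction.

```python
import math

# Hypothetical toy model: a fixed logistic-regression classifier.
WEIGHTS = [2.0, -1.0, 0.5, 0.1]
BIAS = -0.5


def predict_proba(x):
    """Probability of class 1 under the toy model."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def predict(x):
    """Hard label (0 or 1) under the toy model."""
    return 1 if predict_proba(x) >= 0.5 else 0


def occlusion_attribution(x):
    """Local explanation by occlusion (an assumed, simple XAI stand-in).

    Each feature's score is the drop in confidence for the currently
    predicted class when that feature alone is set to zero.
    """
    label = predict(x)

    def confidence(v):
        p = predict_proba(v)
        return p if label == 1 else 1.0 - p

    base = confidence(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = 0.0
        scores.append(base - confidence(occluded))
    return scores


def explanation_guided_fuzz(seed, step=0.1, max_iters=100):
    """Mutate the most influential feature until the predicted label changes.

    Returns (mutant, iterations) on success, or (None, max_iters) on timeout.
    """
    original_label = predict(seed)
    x = list(seed)
    for iteration in range(1, max_iters + 1):
        scores = occlusion_attribution(x)
        # Pick the feature that supports the current prediction the most.
        target = max(range(len(x)), key=lambda i: scores[i])
        # Push that feature toward (and past) zero by one small step.
        x[target] -= step if x[target] > 0 else -step
        if predict(x) != original_label:
            return x, iteration
    return None, max_iters


if __name__ == "__main__":
    seed = [1.0, 0.2, 0.5, 0.3]
    mutant, iterations = explanation_guided_fuzz(seed)
    print("seed label:", predict(seed))
    print("mutant:", mutant, "found after", iterations, "iterations")
```

A random fuzzer would spread its mutations across all four features; here the explanation concentrates them on the feature that matters most for the current prediction, which is the intuition behind focusing on the most impactful parts of the input.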

Problem

Research questions and friction points this paper is trying to address.

Random perturbations make semantic-based test generators inefficient at finding failure-inducing inputs for DL systems.

Can XAI guide fuzz testing toward failures of the DL system under test?

Can focusing on the most impactful parts of the input make test generation more effective and efficient?

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages XAI for targeted fuzz testing

Guides test generation using local explanations

Generates up to 125% more failure-inducing inputs than an existing baseline, up to 7X faster

🔎 Similar Papers

On the Challenges of Fuzzing Techniques via Large Language Models