Shapley-Guided Neural Repair Approach via Derivative-Free Optimization

📅 2026-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep neural networks are vulnerable to various defects, including backdoor attacks, adversarial examples, and fairness violations. Existing repair methods often rely on gradient information or suffer from limited interpretability and generalizability. This work proposes SHARPEN, an interpretable fault localization framework based on Shapley values (Deep SHAP) that uses activation discrepancy analysis to identify faulty layers first and then the faulty neurons within them. It then repairs the located neurons jointly with CMA-ES, a gradient-free evolutionary optimization strategy. The approach requires no gradient information, applies across architectures, and handles multiple repair properties. Reported results show gains over baselines of 10.56% in backdoor removal, 5.78% in adversarial robustness, and 11.82% in fairness repair, while balancing property repair against preservation of the original model accuracy.
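To make the Shapley-based localization idea concrete, here is a minimal sketch. It does not use Deep SHAP, which approximates Shapley values for deep networks; instead it enumerates exact Shapley values on a toy four-neuron "layer". The weights, the interaction term, and the names `error_score` and `rank_suspicious` are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial

def shapley_values(n_players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = [0.0] * n_players
    for order in permutations(range(n_players)):
        coalition = set()
        prev = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            cur = value(frozenset(coalition))
            phi[p] += cur - prev  # marginal contribution of neuron p in this ordering
            prev = cur
    n_orders = factorial(n_players)
    return [v / n_orders for v in phi]

# Hypothetical toy layer: each active neuron adds a fixed amount to the
# erroneous-output score, and neurons 0 and 2 interact when both are active.
WEIGHTS = [0.5, 0.1, 2.0, 0.0]

def error_score(active):
    score = sum(WEIGHTS[i] for i in active)
    if 0 in active and 2 in active:
        score += 1.0
    return score

def rank_suspicious(n_neurons, value, top_k):
    """Return the top_k neurons by Shapley value, plus all Shapley values."""
    phi = shapley_values(n_neurons, value)
    ranked = sorted(range(n_neurons), key=lambda i: phi[i], reverse=True)
    return ranked[:top_k], phi

if __name__ == "__main__":
    top, phi = rank_suspicious(4, error_score, top_k=2)
    print("most suspicious neurons:", top)
    print("Shapley values:", phi)
```

Exact enumeration costs factorial time in the number of neurons, so it is only workable for a handful of units; that cost is the reason approximations such as Deep SHAP are used on real networks.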
📝 Abstract
DNNs are susceptible to defects like backdoors, adversarial attacks, and unfairness, undermining their reliability. Existing approaches mainly involve retraining, optimization, constraint-solving, or search algorithms. However, most methods rely on gradient calculations, restricting applicability to specific activation functions (e.g., ReLU), or use search algorithms with uninterpretable localization and repair. Furthermore, they often lack generalizability across multiple properties. We propose SHARPEN, integrating interpretable fault localization with a derivative-free optimization strategy. First, SHARPEN introduces a Deep SHAP-based localization strategy quantifying each layer's and neuron's marginal contribution to erroneous outputs. Specifically, a hierarchical coarse-to-fine approach reranks layers by aggregated impact, then locates faulty neurons/filters by analyzing activation divergences between property-violating and benign states. Subsequently, SHARPEN incorporates CMA-ES to repair identified neurons. CMA-ES leverages a covariance matrix to capture variable dependencies, enabling gradient-free search and coordinated adjustments across coupled neurons. By combining interpretable localization with evolutionary optimization, SHARPEN enables derivative-free repair across architectures, being less sensitive to gradient anomalies and hyperparameters. We demonstrate SHARPEN's effectiveness on three repair tasks. Balancing property repair and accuracy preservation, it outperforms baselines in backdoor removal (+10.56%), adversarial mitigation (+5.78%), and unfairness repair (+11.82%). Notably, SHARPEN handles diverse tasks, and its modular design is plug-and-play with different derivative-free optimizers, highlighting its flexibility.
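The repair step described in the abstract can be sketched as a derivative-free search over only the located weights. The sketch below is not CMA-ES: it is a simpler evolution strategy with an isotropic, decaying step size and no covariance adaptation, swapped in so the example stays self-contained. The toy linear model, the data in `CLEAN` and `TRIGGERED`, the loss weighting `alpha`, and all function names are assumptions made for illustration, not the authors' setup.

```python
import random

# Hypothetical toy data: activations of three neurons for clean inputs and
# for the same inputs with a backdoor trigger (neuron 2 fires strongly).
CLEAN = [(1.0, 0.5, 0.0), (0.2, 1.0, 0.1)]
TRIGGERED = [(1.0, 0.5, 3.0), (0.2, 1.0, 2.5)]
W0 = [1.0, -0.5, 2.0]  # original (defective) output weights

def output(w, a):
    return sum(wi * ai for wi, ai in zip(w, a))

def repair_loss(w, alpha=1.0):
    """Black-box objective: property violation plus weighted accuracy drift."""
    # Property term: a triggered input should behave like its clean counterpart.
    prop = sum(abs(output(w, t) - output(W0, c))
               for t, c in zip(TRIGGERED, CLEAN)) / len(CLEAN)
    # Accuracy term: clean outputs should stay close to the original model's.
    acc = sum(abs(output(w, c) - output(W0, c)) for c in CLEAN) / len(CLEAN)
    return prop + alpha * acc

def evolve_repair(w_init, located, loss, generations=40, lam=12, mu=4,
                  sigma=0.5, decay=0.95, seed=0):
    """Simplified evolution strategy that perturbs only the located weights."""
    rng = random.Random(seed)
    mean = list(w_init)
    best_w, best_loss = list(w_init), loss(w_init)
    for _ in range(generations):
        population = []
        for _ in range(lam):
            cand = list(mean)
            for i in located:
                cand[i] += sigma * rng.gauss(0.0, 1.0)
            population.append((loss(cand), cand))
        population.sort(key=lambda pair: pair[0])
        elite = [cand for _, cand in population[:mu]]
        for i in located:  # recombine: move the mean to the elite average
            mean[i] = sum(c[i] for c in elite) / mu
        if population[0][0] < best_loss:
            best_loss, best_w = population[0][0], list(population[0][1])
        sigma *= decay  # fixed decay schedule instead of CMA-ES adaptation
    return best_w, best_loss

if __name__ == "__main__":
    w, l = evolve_repair(W0, located=[0, 2], loss=repair_loss)
    print("loss before:", repair_loss(W0), "after:", l)
    print("repaired weights:", w)
```

The objective is only ever evaluated, never differentiated, which is the property that lets such a search work with any activation function. Full CMA-ES additionally adapts a covariance matrix so that coupled weights are perturbed together, which this simplified variant omits.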
Problem

Research questions and friction points this paper is trying to address.

backdoor
adversarial attacks
unfairness
neural repair
reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Shapley value
derivative-free optimization
fault localization
CMA-ES
neural repair
Xinyu Sun
National University of Defense Technology
Wanwei Liu
National University of Defense Technology
Haoang Chi
National University of Defense Technology
Tingyu Chen
National University of Defense Technology
Xiaoguang Mao
National University of Defense Technology
Shangwen Wang
National University of Defense Technology
Software Engineering
Lei Bu
Nanjing University
Model Checking, Hybrid System, Cyber-Physical System, Formal Verification
Jingyi Wang
Assistant Professor, Zhejiang University
Trustworthy AI, Software Engineering, Formal Methods, Security
Yang Tan
National University of Defense Technology
Zhenyi Qi
National University of Defense Technology