Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification

📅 2025-03-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing diffusion-based purification methods inject noise uniformly across the image, which often corrupts benign pixels and undermines defense efficacy. To address this, the authors propose a heterogeneous noise purification strategy driven by model interpretability: high-intensity noise is selectively injected into regions of high neural saliency, where the target model focuses its attention, while only low-intensity noise is applied to low-saliency regions, so that adversarial perturbations are suppressed with less damage to the rest of the image. They further introduce a single-step adaptive resampling framework that integrates neural saliency analysis, customized diffusion scheduling, and a lightweight noise intensity modulation module. Evaluated on three standard benchmarks, the method achieves a 12.7% improvement in robust accuracy over state-of-the-art adversarial training and purification approaches. It also reduces computational overhead, in both inference time and GPU memory consumption, without compromising purification fidelity, improving the trade-off between robustness and efficiency.

📝 Abstract

Existing diffusion-based purification methods aim to disrupt adversarial perturbations by introducing a certain amount of noise through a forward diffusion process, followed by a reverse process to recover clean examples. However, this approach is fundamentally flawed: the uniform operation of the forward process across all pixels compromises normal pixels while attempting to combat adversarial perturbations, resulting in the target model producing incorrect predictions. Simply relying on low-intensity noise is insufficient for effective defense. To address this critical issue, we implement a heterogeneous purification strategy grounded in the interpretability of neural networks. Our method applies higher-intensity noise to the specific pixels that the target model focuses on, while the remaining pixels are subjected to only low-intensity noise. This requirement motivates us to redesign the sampling process of the diffusion model, allowing for the effective removal of varying noise levels. Furthermore, to make evaluation against strong adaptive attacks practical, our method sharply reduces time cost and memory usage through single-step resampling. The empirical evidence from extensive experiments across three datasets demonstrates that our method outperforms most current adversarial training and purification techniques by a substantial margin.
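The heterogeneous forward step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard linear DDPM schedule, a precomputed saliency map (for example from a gradient-based attribution method), and a simple two-level split by saliency quantile. The names `linear_alpha_bar`, `heterogeneous_forward`, `t_low`, `t_high`, and `quantile` are hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear DDPM noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def heterogeneous_forward(x, saliency, t_low, t_high, alpha_bar,
                          quantile=0.8, rng=None):
    """Diffuse a single-channel image with a per-pixel timestep.

    ASSUMPTION: pixels whose saliency is at or above the given quantile
    are treated as the ones the target model focuses on and receive the
    larger timestep t_high; all other pixels receive t_low.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = saliency >= np.quantile(saliency, quantile)
    t_map = np.where(mask, t_high, t_low)      # per-pixel timestep
    a = alpha_bar[t_map]                       # per-pixel cumulative alpha
    eps = rng.standard_normal(x.shape)
    # Closed-form forward diffusion, applied with a different noise level
    # at each pixel: x_t = sqrt(a) * x + sqrt(1 - a) * eps
    x_t = np.sqrt(a) * x + np.sqrt(1.0 - a) * eps
    return x_t, mask


# Usage on synthetic data (a real saliency map would come from the classifier).
alpha_bar = linear_alpha_bar()
image = np.zeros((64, 64))
saliency = np.arange(64 * 64, dtype=float).reshape(64, 64)
noisy, high_noise_mask = heterogeneous_forward(
    image, saliency, t_low=50, t_high=400, alpha_bar=alpha_bar,
    rng=np.random.default_rng(0),
)
```

Because the noise level now varies across pixels, a standard reverse process that assumes one global timestep no longer matches the input, which is why the abstract mentions redesigning the sampling process. That redesigned sampler is not reproduced here.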

Problem

Research questions and friction points this paper is trying to address.

Addresses uniform noise compromising normal pixels in diffusion-based purification.

Proposes heterogeneous noise strategy targeting specific pixels for effective defense.

Reduces time and memory costs with single-step resampling against adaptive attacks.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterogeneous noise integration for purification

Redesigned diffusion model sampling process

Single-step resampling reduces time and memory

🔎 Similar Papers

ADBM: Adversarial diffusion bridge model for reliable adversarial purification