Diffusion-based Adversarial Purification from the Perspective of the Frequency Domain

📅 2025-05-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing diffusion-based adversarial purification methods operate in the pixel domain and, lacking prior knowledge of how adversarial perturbations are distributed, often degrade semantic fidelity. This work decomposes images into amplitude and phase spectra (via FFT) and observes that, for both spectra, the damage caused by adversarial perturbations tends to increase monotonically with frequency. Building on this observation, it proposes a selective frequency-domain preservation mechanism: low-frequency amplitude components are retained to preserve content, while a constraint on the low-frequency phase spectrum preserves structure. The mechanism is applied at each time step of the reverse diffusion process. Evaluated on CIFAR-10 and ImageNet, the method reportedly outperforms state-of-the-art defenses, with up to a 12.3% improvement in robust accuracy and no loss of clean accuracy.

📝 Abstract

Diffusion-based adversarial purification methods attempt to drown adversarial perturbations in isotropic noise through the forward process and then recover clean images through the reverse process. Because the pixel domain offers no information about how adversarial perturbations are distributed, these methods often unavoidably damage normal semantics. We turn to the frequency domain, decomposing the image into an amplitude spectrum and a phase spectrum. We find that, for both spectra, the damage caused by adversarial perturbations tends to increase monotonically with frequency. This means that the content and structural information of the original clean sample can be extracted from the less damaged frequency components. Meanwhile, theoretical analysis indicates that existing purification methods damage all frequency components indiscriminately, which harms the image more than necessary. We therefore propose a purification method that eliminates adversarial perturbations while preserving as much of the original image's content and structure as possible. Specifically, at each time step of the reverse process, we replace the low-frequency components of the estimated image's amplitude spectrum with the corresponding components of the adversarial image. For the phase spectrum, we project the phase of the estimated image into a designated range around the adversarial image's phase spectrum, focusing on the low frequencies. Extensive experiments demonstrate that our method significantly outperforms most current defense methods.
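The two frequency-domain operations described in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the radial low-pass mask, and the `cutoff` and `phase_delta` values are hypothetical, and the abstract does not say how the "designated range" for the phase is defined. Here it is assumed to be a symmetric interval around the adversarial phase.

```python
import numpy as np

def radial_frequency(h, w):
    """Distance of each FFT bin from the zero-frequency bin (cycles per pixel)."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.sqrt(fx ** 2 + fy ** 2)

def frequency_guidance(x_est, x_adv, cutoff=0.1, phase_delta=0.2):
    """Hypothetical guidance step for one single-channel image.

    x_est: image estimated at the current reverse step.
    x_adv: adversarial input image.
    cutoff, phase_delta: illustrative values, not taken from the paper.
    """
    F_est = np.fft.fft2(x_est)
    F_adv = np.fft.fft2(x_adv)
    amp_est, pha_est = np.abs(F_est), np.angle(F_est)
    amp_adv, pha_adv = np.abs(F_adv), np.angle(F_adv)

    # Assumed definition of "low frequency": bins inside a radial cutoff.
    low = radial_frequency(*x_est.shape) <= cutoff

    # Amplitude: take the adversarial image's amplitude in the low band.
    amp_new = np.where(low, amp_adv, amp_est)

    # Phase: wrap the difference to (-pi, pi], then clip it so the estimated
    # phase stays within phase_delta of the adversarial phase in the low band.
    diff = np.angle(np.exp(1j * (pha_est - pha_adv)))
    pha_proj = pha_adv + np.clip(diff, -phase_delta, phase_delta)
    pha_new = np.where(low, pha_proj, pha_est)

    # Recombine and return to the pixel domain.
    F_new = amp_new * np.exp(1j * pha_new)
    return np.real(np.fft.ifft2(F_new))
```

In a full purification loop, a step like this would presumably be applied to the denoised estimate at every reverse time step, once per color channel. The high-frequency bins are left entirely to the diffusion model, which matches the observation that those components are the most damaged.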

Problem

Research questions and friction points this paper is trying to address.

Analyzes adversarial perturbations in frequency-domain spectra

Proposes a purification method that preserves image content and structure

Replaces low-frequency components to reduce semantic damage
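One way to inspect how perturbation damage varies with frequency is to compare the amplitude and phase spectra of a clean image and its adversarial counterpart, band by band. The sketch below is an assumption-laden illustration rather than the paper's measurement protocol: the function name `band_damage`, the equal-width radial bands, and the two error measures are all hypothetical choices.

```python
import numpy as np

def band_damage(x_clean, x_adv, n_bands=8):
    """Per-band spectral damage between two single-channel images.

    Returns two arrays of length n_bands, ordered from low to high frequency:
    mean relative amplitude change and mean absolute phase change (radians).
    """
    F_c = np.fft.fft2(x_clean)
    F_a = np.fft.fft2(x_adv)
    h, w = x_clean.shape

    # Radial frequency of every FFT bin, then an equal-width band index.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fx ** 2 + fy ** 2)
    edges = np.linspace(0.0, r.max(), n_bands + 1)
    band = np.clip(np.digitize(r, edges) - 1, 0, n_bands - 1)

    # Relative amplitude change; the small constant avoids division by zero.
    amp_err = np.abs(np.abs(F_a) - np.abs(F_c)) / (np.abs(F_c) + 1e-8)
    # Wrapped phase difference in [0, pi].
    pha_err = np.abs(np.angle(F_a * np.conj(F_c)))

    amp = np.array([amp_err[band == b].mean() for b in range(n_bands)])
    pha = np.array([pha_err[band == b].mean() for b in range(n_bands)])
    return amp, pha
```

Averaging these curves over many clean/adversarial pairs and plotting them against the band index would show whether the damage grows with frequency for a given attack and dataset.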

Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-domain analysis for adversarial purification

Selective low-frequency amplitude spectrum replacement

Low-frequency phase spectrum projection to preserve structure

🔎 Similar Papers

ADBM: Adversarial diffusion bridge model for reliable adversarial purification