🤖 AI Summary
This paper addresses the limited robustness of deep models against adversarial attacks in both the frequency and spatial domains. We propose a general adversarial purification framework based on diffusion models, which operates without assumptions about specific attack types and without joint training with classifiers. Guided by spectral analysis, our method performs denoising sampling that simultaneously suppresses non-pixel-level adversarial distortions spanning low to high frequencies in both the spectral and spatial domains. To our knowledge, this is the first systematic demonstration of diffusion models' ability to generalize when purifying broad-spectrum frequency-domain attacks (e.g., Fourier perturbations, spectral masking), overcoming the strong attack-type dependency inherent in conventional adversarial training. Evaluated on CIFAR-10 and ImageNet, our approach achieves an average robust accuracy improvement of 12.7%, significantly enhancing model resilience against unseen spectral adversarial attacks.
📝 Abstract
Adversarial training is a common strategy for enhancing model robustness against adversarial attacks. However, it is typically tailored to the specific attack types it is trained on, limiting its ability to generalize to unseen threat models. Adversarial purification offers an alternative by leveraging a generative model to remove perturbations before classification. Since the purifier is trained independently of both the classifier and the threat models, it is better equipped to handle previously unseen attack scenarios. Diffusion models have proven highly effective for noise purification, not only in countering pixel-wise adversarial perturbations but also in addressing non-adversarial data shifts. In this study, we broaden the focus beyond pixel-wise robustness to explore the extent to which purification can mitigate both spectral and spatial adversarial attacks. Our findings highlight its effectiveness in handling diverse distortion patterns across low- to high-frequency regions.
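The purification recipe described above (partially forward-diffuse the adversarial input, then denoise it back toward the data distribution before classification) can be illustrated with a minimal toy sketch. This is not the paper's code: the 1-D Gaussian data model, the linear VP-SDE schedule, and all function names below are our own illustrative assumptions, chosen so the score of the diffused marginal is available in closed form and no trained network is needed.

```python
import numpy as np

# Toy diffusion-based purification sketch (illustrative, not the paper's method).
# Clean data is assumed ~ N(mu, sigma^2), so the score of every diffused
# marginal p_t is known analytically and stands in for a learned score model.
mu, sigma = 2.0, 0.1          # assumed clean-data distribution
beta0, beta1 = 0.1, 20.0      # linear VP-SDE schedule: beta(t) = beta0 + t*(beta1 - beta0)

def alpha_bar(t):
    # exp(-integral_0^t beta(s) ds) for the linear schedule
    return np.exp(-(beta0 * t + 0.5 * (beta1 - beta0) * t ** 2))

def score(x, t):
    # Closed-form grad_x log p_t(x) for the Gaussian toy data
    ab = alpha_bar(t)
    var_t = ab * sigma ** 2 + (1.0 - ab)
    return -(x - np.sqrt(ab) * mu) / var_t

def purify(x_adv, t_star=0.3, steps=200, rng=None):
    """Forward-diffuse to t_star, then integrate the probability-flow ODE back to t=0."""
    rng = rng or np.random.default_rng(0)
    ab = alpha_bar(t_star)
    # Forward diffusion: partially noise the adversarial input
    x = np.sqrt(ab) * x_adv + np.sqrt(1.0 - ab) * rng.standard_normal()
    # Reverse: Euler steps on dx/dt = -beta/2 * x - beta/2 * score(x, t), from t_star down to 0
    dt = t_star / steps
    for i in range(steps):
        t = t_star - i * dt
        beta = beta0 + t * (beta1 - beta0)
        x -= (-0.5 * beta * x - 0.5 * beta * score(x, t)) * dt
    return x

x_clean = 2.05
x_adv = x_clean + 1.0          # large "adversarial" shift off the data manifold
x_pur = purify(x_adv)
print(abs(x_pur - mu) < abs(x_adv - mu))  # purification pulls the sample back near the data mode
```

The choice of `t_star` mirrors the trade-off the purification literature discusses: enough forward noise must be added to wash out the adversarial perturbation, but not so much that the sample's semantic content is destroyed.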