🤖 AI Summary
Prior work on diffusion-based denoised smoothing has focused almost exclusively on classification, leaving its robustness–utility trade-offs unexamined in broader downstream applications such as object detection, semantic segmentation, and regression. Method: this paper systematically evaluates diffusion denoised smoothing across three datasets, four downstream tasks, and three adversarial attack settings, using a pre-trained diffusion model as an input-preprocessing step and jointly assessing robustness and accuracy. Contributions/Results: (1) high-noise diffusion severely degrades performance on clean samples, with drops of up to 57%; (2) a novel adversarial attack targeting the diffusion process itself can bypass the defense even under low-noise configurations; (3) low-noise diffusion fails to provide robustness across all attack types, while high-noise settings incur prohibitive utility loss, revealing an inherent robustness–utility trade-off. These findings offer empirical grounding and practical cautions for deploying diffusion-based robustness in real-world multi-task settings.
📝 Abstract
While foundation models demonstrate impressive performance across various tasks, they remain vulnerable to adversarial inputs. Current research explores various approaches to enhancing model robustness, with Diffusion Denoised Smoothing emerging as a particularly promising technique. This method employs a pretrained diffusion model to preprocess inputs before model inference. Yet its effectiveness remains largely unexplored beyond classification. We address this gap by analyzing three datasets across four distinct downstream tasks under three different adversarial attack algorithms. Our findings reveal that while foundation models maintain resilience against conventional transformations, applying high-noise diffusion denoising to clean, undistorted images degrades performance by as much as 57%. Low-noise diffusion settings preserve performance but fail to provide adequate protection across all attack types. Moreover, we introduce a novel attack strategy that targets the diffusion process itself and circumvents the defense in the low-noise regime. Our results suggest that the trade-off between adversarial robustness and performance remains an open challenge.
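To make the preprocessing pipeline concrete, here is a minimal sketch of the denoised-smoothing idea the abstract describes: Gaussian noise at scale sigma is added to the input, a diffusion-style denoiser maps it back toward the data manifold, and the downstream model votes over several draws. The `denoise` and `classify` callables below are hypothetical stand-ins (a real deployment would use a pretrained diffusion model's one-shot denoiser and a foundation-model head); this is not the paper's implementation, only an illustration of the general scheme.

```python
import random
from collections import Counter

def denoised_smoothing_predict(x, sigma, n_samples, denoise, classify, rng=None):
    """Smoothed prediction with a (placeholder) diffusion-style denoiser.

    For each of n_samples draws, Gaussian noise at scale sigma is added
    to the input, the denoiser cleans the noisy input, and the base
    classifier casts a vote; the majority class is returned.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        denoised = denoise(noisy, sigma)
        votes[classify(denoised)] += 1
    return votes.most_common(1)[0][0]

# Toy stand-ins for illustration only:
toy_denoise = lambda x, sigma: x          # identity: performs no real denoising
toy_classify = lambda x: int(sum(x) > 0)  # sign-of-sum "classifier"

pred = denoised_smoothing_predict([0.5, 0.7, 0.9], sigma=0.25, n_samples=25,
                                  denoise=toy_denoise, classify=toy_classify)
```

The sigma used for noising (and passed to the denoiser) is exactly the knob the abstract discusses: large sigma buys robustness but hurts clean accuracy, while small sigma preserves utility but leaves some attacks unblocked.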