Robustifying Diffusion-Denoised Smoothing Against Covariate Shift

📅 2025-09-13
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
In randomized smoothing, employing pre-trained diffusion denoising models introduces covariate shift due to biased noise estimation, degrading certified robustness. This shift arises because the standard denoising objective is misaligned with the actual noise distribution induced by the smoothing process. To address this, we propose an adversarial training objective tailored to the diffusion process: adversarial perturbations are injected during the noise addition stage, explicitly adapting the base classifier to the true smoothed noise distribution. Crucially, our method requires no architectural modifications to the denoising model and no retraining of the diffusion model; covariate shift is mitigated at its source solely by reformulating the training objective. Evaluated on MNIST, CIFAR-10, and ImageNet under ℓ₂ perturbations, our approach achieves state-of-the-art certified accuracy, significantly outperforming existing randomized smoothing and diffusion-augmented robustness methods.
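The pipeline described above can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: `denoiser`, `classifier`, and the finite-difference gradient are hypothetical stand-ins (a real setup would use a pretrained diffusion denoiser and autograd), and the single signed ascent step on the noise is only meant to show the idea of perturbing the added noise adversarially before it reaches the denoiser.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoiser(x_noisy, sigma):
    # Toy stand-in for a pretrained diffusion denoiser (simple shrinkage).
    return x_noisy / (1.0 + sigma**2)

def classifier(x):
    # Toy linear base classifier returning logits for 2 classes.
    w = np.array([[1.0, -1.0], [-1.0, 1.0]])
    return x @ w.T

def loss_grad_wrt_noise(x, noise, sigma, label):
    # Finite-difference estimate of d(loss)/d(noise) for the toy pipeline;
    # a real implementation would backpropagate through the full model.
    eps = 1e-4
    def loss(n):
        logits = classifier(denoiser(x + n, sigma))
        z = logits - logits.max()          # stable log-softmax
        return -(z[label] - np.log(np.exp(z).sum()))
    g = np.zeros_like(noise)
    for i in range(noise.size):
        d = np.zeros_like(noise)
        d[i] = eps
        g[i] = (loss(noise + d) - loss(noise - d)) / (2 * eps)
    return g

def adversarial_noise(x, sigma, label, step=0.1):
    # Step 1: sample Gaussian noise exactly as in randomized smoothing.
    noise = rng.normal(0.0, sigma, size=x.shape)
    # Step 2: perturb the *noise* in the loss-ascent direction, so the base
    # classifier trains under the worst-case shift the denoiser induces.
    g = loss_grad_wrt_noise(x, noise, sigma, label)
    return noise + step * np.sign(g)

x = np.array([2.0, -1.0])
label = 0
noise = adversarial_noise(x, sigma=0.5, label=label)
logits = classifier(denoiser(x + noise, 0.5))
print(logits.shape)  # (2,)
```

Training the base classifier on `x + noise` after the denoiser, rather than on cleanly denoised samples, is what aligns it with the true smoothed noise distribution.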

๐Ÿ“ Abstract
Randomized smoothing is a well-established method for achieving certified robustness against ℓ₂-adversarial perturbations. By incorporating a denoiser before the base classifier, pretrained classifiers can be seamlessly integrated into randomized smoothing without significant performance degradation. Among existing methods, Diffusion Denoised Smoothing, where a pretrained denoising diffusion model serves as the denoiser, has produced state-of-the-art results. However, we show that employing a denoising diffusion model introduces a covariate shift via misestimation of the added noise, ultimately degrading the smoothed classifier's performance. To address this issue, we propose a novel adversarial objective function focused on the added noise of the denoising diffusion model. This approach is inspired by our understanding of the origin of the covariate shift. Our goal is to train the base classifier to ensure it is robust against the covariate shift introduced by the denoiser. Our method significantly improves certified accuracy across three standard classification benchmarks, MNIST, CIFAR-10, and ImageNet, achieving new state-of-the-art performance against ℓ₂-adversarial perturbations. Our implementation is publicly available at https://github.com/ahedayat/Robustifying-DDS-Against-Covariate-Shift
Problem

Research questions and friction points this paper is trying to address.

Addressing the covariate shift introduced by diffusion-denoised smoothing
Improving robustness of classifiers against ℓ₂-adversarial perturbations
Enhancing certified accuracy on the MNIST, CIFAR-10, and ImageNet benchmarks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial objective function targeting the denoiser's added noise
Training the base classifier to withstand the denoiser-induced covariate shift
Improved certified accuracy across all three benchmarks
🔎 Similar Papers
No similar papers found.
Ali Hedayatnia
Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
Mostafa Tavassolipour
Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
Babak Nadjar Araabi
Department of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
Abdol-Hossein Vahabie
School of ECE, & Faculty of Psychology, University of Tehran; School of Cognitive Sciences, IPM
Cognitive Neuroscience, Neuroeconomics, Neural Dynamics, Machine Learning, Computational Psychiatry