Provable Robustness against Backdoor Attacks via the Primal-Dual Perspective on Differential Privacy

📅 2026-05-20
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the difficulty that existing randomized smoothing approaches have in providing a unified robustness guarantee against backdoor attacks, which perturb both training and test data. To bridge this gap, the paper applies the dual view of differential privacy, modeling the randomization mechanisms used during training and inference jointly through privacy profiles. This yields a modular, composable, end-to-end certification framework, instantiated for DP-SGD and Deep Partition Aggregation combined with inference-time randomized smoothing, that supports joint analysis of heterogeneous randomization mechanisms. Experiments on MNIST and CIFAR-10 demonstrate the framework's effectiveness against joint training-time and inference-time attacks.
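For readers unfamiliar with Deep Partition Aggregation, one of the training-time mechanisms named above, the following is a minimal sketch of the vote-gap certificate as it is commonly stated for standalone DPA: base classifiers trained on disjoint partitions vote, and the margin between the winner and the runner-up bounds how many training samples an attacker may insert or remove without changing the prediction. The function name `dpa_certificate` and the example vote counts are illustrative assumptions. This is not the paper's joint certificate, which additionally accounts for inference-time smoothing.

```python
from collections import Counter
from typing import Sequence, Tuple


def dpa_certificate(votes: Sequence[int], num_classes: int) -> Tuple[int, int]:
    """Return (predicted class, certified number of poisoned training samples).

    `votes` holds one predicted label per base classifier. Ties are broken
    in favour of the smaller class index, so a competitor with a smaller
    index needs one vote fewer to overtake the winner.
    """
    counts = Counter(votes)
    # Majority vote with deterministic tie-breaking toward the smaller label.
    pred = min(range(num_classes), key=lambda c: (-counts[c], c))
    # Smallest margin over all competing classes, tie-break adjusted.
    margin = min(
        counts[pred] - (counts[c] + (1 if c < pred else 0))
        for c in range(num_classes)
        if c != pred
    )
    # Each poisoned sample touches one partition and can shift the margin by at most 2.
    return pred, margin // 2


# Hypothetical example: 50 base classifiers, 3 classes.
example_votes = [0] * 30 + [1] * 15 + [2] * 5
prediction, radius = dpa_certificate(example_votes, num_classes=3)
```

In this hypothetical example the margin between class 0 and class 1 is 15 votes, so the prediction is certified against up to 7 inserted or removed training samples.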
๐Ÿ“ Abstract
Randomized smoothing is a powerful tool for certifying robustness to adversarial perturbations, including poisoning attacks via randomized training and evasion attacks via randomized inference. Extending these guarantees to backdoor attacks, where training and test data are jointly perturbed, remains challenging because training- and test-time randomized mechanisms must be analyzed within a single robustness certificate. We address this by connecting randomized smoothing to the dual view of differential privacy through privacy profiles, which provide a numerical procedure for composing heterogeneous mechanisms. The resulting framework enables tight, modular, end-to-end certification of complex, composed mechanisms while leveraging existing analyses of differentially private mechanisms. We instantiate the framework for DP-SGD and Deep Partition Aggregation with inference-time smoothing, deriving joint robustness guarantees against both training-time and inference-time attacks. Experiments on MNIST and CIFAR-10 demonstrate the effectiveness of our framework. Overall, we provide a principled and general framework for using composite mechanisms to certify robustness under complex threat models that better capture the capabilities of real-world adversaries.
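To make the link between privacy profiles and robustness certificates concrete, here is a minimal sketch for a single Gaussian mechanism. It is an illustration under stated assumptions, not the paper's procedure. It assumes the standard closed-form privacy profile of the Gaussian mechanism with `mu = sensitivity / sigma`, and uses the fact that any (eps, delta(eps)) pair bounds how far the probability of an event can drop when the input is perturbed. Maximizing that bound over a grid of eps values should recover the familiar Gaussian randomized smoothing bound. All function names and the values `mu = 0.5`, `p = 0.9` are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Callable, Iterable

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def gaussian_privacy_profile(eps: float, mu: float) -> float:
    """delta(eps) of a Gaussian mechanism with mu = sensitivity / sigma."""
    cdf = _STD_NORMAL.cdf
    return cdf(-eps / mu + mu / 2) - math.exp(eps) * cdf(-eps / mu - mu / 2)


def certified_lower_bound(
    p: float, profile: Callable[[float], float], eps_grid: Iterable[float]
) -> float:
    """Lower-bound the perturbed class probability given clean probability p.

    Each (eps, delta) pair gives two valid bounds: one applied to the event
    itself, one applied to its complement. We keep the best over the grid.
    """
    best = 0.0
    for eps in eps_grid:
        delta = profile(eps)
        best = max(
            best,
            math.exp(-eps) * (p - delta),
            1.0 - delta - math.exp(eps) * (1.0 - p),
        )
    return best


def closed_form_gaussian_bound(p: float, mu: float) -> float:
    """Known tight bound for Gaussian smoothing, used here as a reference."""
    return _STD_NORMAL.cdf(_STD_NORMAL.inv_cdf(p) - mu)


# Hypothetical setting: perturbation norm / noise scale = 0.5, clean probability 0.9.
mu, p = 0.5, 0.9
eps_grid = [i * 0.001 for i in range(10001)]  # eps in [0, 10]
bound = certified_lower_bound(p, lambda e: gaussian_privacy_profile(e, mu), eps_grid)
exact = closed_form_gaussian_bound(p, mu)
certified = bound > 0.5  # binary case: prediction is stable if the bound stays above 1/2
```

The appeal of working with the profile rather than the closed form is that `profile` can be any callable, so a numerically composed profile for a training-time and an inference-time mechanism could in principle be plugged in without changing `certified_lower_bound`.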
Problem

Research questions and friction points this paper is trying to address.

Backdoor Attacks
Provable Robustness
Randomized Smoothing
Differential Privacy
Robustness Certification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Randomized Smoothing
Differential Privacy
Backdoor Attacks
Privacy Profiles
Robustness Certification