🤖 AI Summary
Existing diffusion-based purification methods (e.g., DiffPure) suffer from an inherent trade-off between adversarial noise removal and faithful reconstruction of clean data, while their evaluation relies on weak adaptive attacks, compromising robustness assessment. To address this, we propose the Adversarial Diffusion Bridge Model (ADBM), which constructs a direct reverse diffusion bridge from adversarial examples to clean samples, enabling, for the first time, decoupled noise suppression and structural fidelity preservation in pre-trained diffusion models for purification. ADBM is grounded in theoretically principled diffusion process reparameterization and adversarial bridge modeling, and introduces a rigorous robustness evaluation paradigm resistant to strong adaptive attacks. On CIFAR-10, CIFAR-100, and ImageNet, ADBM achieves an average purification accuracy 7.2% higher than DiffPure, while maintaining ≥91.5% defense success rates against strong adaptive attacks.
📄 Abstract
Recently, Diffusion-based Purification (DiffPure) has been recognized as an effective defense against adversarial examples. However, we find that DiffPure, which directly employs the original pre-trained diffusion models for adversarial purification, is suboptimal, owing to an inherent trade-off between noise purification performance and data recovery quality. Additionally, the reliability of existing evaluations of DiffPure is questionable, as they rely on weak adaptive attacks. In this work, we propose a novel Adversarial Diffusion Bridge Model, termed ADBM. ADBM directly constructs a reverse bridge from the diffused adversarial data back to its original clean examples, enhancing the purification capability of the original diffusion models. Through theoretical analysis and experimental validation across various scenarios, ADBM proves to be a superior and robust defense mechanism with significant promise for practical applications.
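The forward-diffuse-then-denoise pipeline that the abstract refers to can be sketched as follows. This is a scalar toy illustration, not the paper's implementation: the linear beta schedule, the DDIM-style deterministic reverse step, and the `denoise` callable are all illustrative placeholders (in DiffPure, `denoise` would be a pre-trained diffusion model's noise predictor; in ADBM, the reverse process is instead a learned bridge from diffused adversarial data back to the clean example).

```python
import math
import random

def purify(x_adv, denoise, T=50, seed=0):
    """Toy sketch of diffusion-based purification on a scalar input.

    1. Forward-diffuse the (possibly adversarial) input to timestep T,
       drowning the adversarial perturbation in Gaussian noise.
    2. Run deterministic reverse steps back toward t=0, using the
       provided noise predictor `denoise(x_t, t)`.
    """
    rng = random.Random(seed)

    # Illustrative linear beta schedule and cumulative alpha products.
    betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
    alpha_bar, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bar.append(prod)

    # Forward: x_T = sqrt(a_bar) * x + sqrt(1 - a_bar) * eps
    eps = rng.gauss(0.0, 1.0)
    x_t = math.sqrt(alpha_bar[-1]) * x_adv + math.sqrt(1 - alpha_bar[-1]) * eps

    # Reverse: DDIM-style deterministic denoising steps.
    for t in range(T - 1, 0, -1):
        eps_hat = denoise(x_t, t)  # predicted noise at timestep t
        x0_hat = (x_t - math.sqrt(1 - alpha_bar[t]) * eps_hat) / math.sqrt(alpha_bar[t])
        x_t = math.sqrt(alpha_bar[t - 1]) * x0_hat + math.sqrt(1 - alpha_bar[t - 1]) * eps_hat
    return x_t
```

The trade-off the abstract highlights lives in the choice of `T`: a larger `T` injects more noise (better at washing out adversarial perturbations) but makes faithful recovery of the clean sample harder, which is what ADBM's trained adversarial-to-clean bridge aims to sidestep.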