Memory Efficient Full-gradient Attacks (MEFA) Framework for Adversarial Defense Evaluations

📅 2026-05-07

🤖 AI Summary
Existing white-box attacks often resort to approximate gradients when evaluating iterative stochastic purification defenses due to memory constraints, which weakens attack strength and leads to an overestimation of model robustness. This work proposes a memory-efficient full-gradient attack framework that integrates gradient checkpointing with a controllable randomness protocol, enabling, for the first time, exact end-to-end white-box attacks against long-trajectory stochastic defenses such as diffusion- and Langevin-based purification. The method achieves state-of-the-art attack performance under both ℓ∞ and ℓ₂ norms, uncovers vulnerabilities missed by approximate-gradient approaches, and facilitates out-of-distribution robustness analysis, thereby substantially improving the reliability of robustness evaluation.
📝 Abstract
This work studies the robustness evaluation of iterative stochastic purification defenses under white-box adversarial attacks. Our key technical insight is that gradient checkpointing makes exact end-to-end gradient computation through long purification trajectories practical by trading additional recomputation for substantially lower memory usage. This enables full-gradient adaptive attacks against diffusion- and Langevin-based purification defenses, where prior evaluations often resort to approximate backpropagation due to memory constraints. These approximations can weaken the attack signal and risk overestimating robustness. In parallel, stochasticity in iterative purification is frequently under-controlled, even though different purification trajectories can substantially change reported robustness metrics. Building on this insight, we introduce a memory-efficient full-gradient evaluation framework for stochastic purification defenses. The framework combines checkpointed backpropagation with evaluation protocols that control stochastic variability, thereby reducing memory bottlenecks while preserving exact gradients. We evaluate diffusion-based purification and Langevin sampling with Energy-Based Models (EBMs), demonstrating that full-gradient attacks uncover vulnerabilities missed by approximate-gradient evaluations. Our framework yields stronger state-of-the-art $\ell_{\infty}$ and $\ell_{2}$ white-box attacks and further supports probing out-of-distribution robustness. Overall, our results show that exact-gradient evaluation is essential for reliable benchmarking of iterative stochastic defenses.
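To make the recomputation-for-memory trade concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes a toy scalar Langevin-style update on the energy E(x) = log cosh(x), and every name in it (`noise`, `step`, `grad_full`, `grad_checkpointed`, the `every` parameter) is hypothetical. Noise is re-derived from `(seed, t)`, which stands in for a controlled-randomness protocol: the recomputed trajectory matches the forward one exactly, so the checkpointed gradient is exact rather than approximate.

```python
import math
import random


def noise(seed, t):
    # Hypothetical per-step noise, re-derivable from (seed, t) so it is never stored.
    return random.Random(seed * 1_000_003 + t).gauss(0.0, 1.0)


def step(x, t, seed, eta=0.1, sigma=0.05):
    # One toy Langevin-style update on E(x) = log cosh(x); dE/dx = tanh(x).
    return x - eta * math.tanh(x) + sigma * noise(seed, t)


def step_grad(x, eta=0.1):
    # Derivative of one update w.r.t. its input; the noise term is constant in x.
    return 1.0 - eta * (1.0 - math.tanh(x) ** 2)


def purify(x0, T, seed):
    # Run the full T-step trajectory and return the final state x_T.
    x = x0
    for t in range(T):
        x = step(x, t, seed)
    return x


def grad_full(x0, T, seed):
    # Baseline: store every intermediate state, so memory grows linearly with T.
    xs = [x0]
    for t in range(T):
        xs.append(step(xs[-1], t, seed))
    g = 1.0  # d x_T / d x_T
    for t in reversed(range(T)):
        g *= step_grad(xs[t])
    return g, len(xs)


def grad_checkpointed(x0, T, seed, every):
    # Forward pass: keep only every `every`-th state as a checkpoint.
    ckpts = {0: x0}
    x = x0
    for t in range(T):
        x = step(x, t, seed)
        if (t + 1) % every == 0 and t + 1 < T:
            ckpts[t + 1] = x
    g, peak = 1.0, 0
    # Backward pass: walk segments last to first, recomputing their states.
    for start in sorted(ckpts, reverse=True):
        end = min(start + every, T)
        seg = [ckpts[start]]
        for t in range(start, end - 1):
            seg.append(step(seg[-1], t, seed))
        # Simple upper bound on states held at once (checkpoints + one segment).
        peak = max(peak, len(ckpts) + len(seg))
        for x_t in reversed(seg):
            g *= step_grad(x_t)
    return g, peak


if __name__ == "__main__":
    g_full, mem_full = grad_full(1.5, 100, seed=0)
    g_ckpt, mem_ckpt = grad_checkpointed(1.5, 100, seed=0, every=10)
    print(g_full, g_ckpt, mem_full, mem_ckpt)
```

With 100 steps and a checkpoint every 10 steps, the sketch holds at most 20 states at a time instead of 101, at the cost of roughly one extra forward pass. In a deep-learning framework the same role is typically played by a built-in checkpointing utility, with the random-number-generator state saved and restored so that recomputation replays the same noise.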
Problem

Research questions and friction points this paper is trying to address.

adversarial defense evaluation
stochastic purification
gradient approximation
memory bottleneck
robustness overestimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

gradient checkpointing
full-gradient attack
stochastic purification
adversarial robustness evaluation
memory-efficient backpropagation