🤖 AI Summary
Existing diffusion-based adversarial attack methods suffer from low efficiency and poor generalization, requiring either hundreds of sampling steps or model fine-tuning. This paper proposes a training-free black-box adversarial generation method: it injects perturbations within a mixing-step interval of the diffusion process and employs a selective RGB-channel perturbation strategy guided jointly by attention maps and GradCAM. The approach generates high-fidelity adversarial examples in only 3–20 diffusion steps, requires nothing more than an unconditional diffusion model, and maintains both high attack success rates and visual fidelity (PSNR > 30 dB). On ImageNet, it achieves attack success rates of 70.6%, 80.8%, and 97.8% against ResNet, MNASNet, and ShuffleNet, respectively—outperforming state-of-the-art methods by up to 10× in speed—and yields the lowest robust accuracy under purification defenses. To our knowledge, this is the first diffusion-based black-box attack that simultaneously eliminates the need for fine-tuning, reduces the step count, and preserves perceptual quality.
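The core idea—injecting perturbations only within an interval of the few-step reverse-diffusion trajectory rather than at every timestep—can be sketched as follows. This is a toy illustration, not the paper's implementation: `denoise_step`, the interval bounds, and the uniform noise are all placeholder assumptions.

```python
import numpy as np

def denoise_step(x, t):
    # Placeholder for one reverse-diffusion step of an unconditional
    # model; a real implementation would call the diffusion network here.
    return x * 0.99

def generate_adversarial(x, num_steps=10, inject_window=(3, 7), eps=8 / 255):
    """Toy sketch: add perturbations only inside a step interval.

    x: image array in [0, 1]. The window bounds and noise schedule are
    illustrative assumptions, not the paper's actual settings.
    """
    rng = np.random.default_rng(0)
    for t in range(num_steps):
        x = denoise_step(x, t)
        # Inject a bounded perturbation only within the chosen interval,
        # so most denoising steps run unmodified.
        if inject_window[0] <= t < inject_window[1]:
            x = x + rng.uniform(-eps, eps, size=x.shape)
        x = np.clip(x, 0.0, 1.0)
    return x
```

Because perturbations touch only a few of the 3–20 steps, the attack avoids the per-step overhead that makes full-trajectory diffusion attacks slow.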
📝 Abstract
Adversarial attacks from generative models often produce low-quality images and require substantial computational resources. Diffusion models, though capable of high-quality generation, typically need hundreds of sampling steps for adversarial generation. This paper introduces TAIGen, a training-free black-box method for efficient adversarial image generation. TAIGen produces adversarial examples using only 3–20 sampling steps from unconditional diffusion models. Our key finding is that perturbations injected during the mixing-step interval achieve comparable attack effectiveness without processing all timesteps. We develop a selective RGB channel strategy that applies attention maps to the red channel while using GradCAM-guided perturbations on the green and blue channels. This design preserves image structure while maximizing misclassification in target models. TAIGen maintains visual quality with PSNR above 30 dB across all tested datasets. On ImageNet with VGGNet as the source model, TAIGen achieves 70.6% success against ResNet, 80.8% against MNASNet, and 97.8% against ShuffleNet, generating adversarial examples 10× faster than existing diffusion-based attacks. TAIGen also yields the lowest robust accuracy among compared attacks, indicating that purification defenses are least successful at removing its perturbations.
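The selective RGB channel strategy described above can be sketched in a few lines. This is a hypothetical illustration under stated assumptions: the function name, the uniform noise, and the use of the saliency maps as per-pixel scaling masks are my assumptions, not the paper's exact formulation.

```python
import numpy as np

def selective_rgb_perturbation(img, attn_map, gradcam_map, eps=8 / 255):
    """Sketch: attention-guided noise on red, GradCAM-guided on green/blue.

    img: float image in [0, 1], shape (H, W, 3)
    attn_map, gradcam_map: saliency maps in [0, 1], shape (H, W)
    """
    rng = np.random.default_rng(0)
    noise = rng.uniform(-eps, eps, size=img.shape[:2])
    out = img.copy()
    # Red channel: perturbation weighted by the attention map,
    # concentrating changes where the model attends.
    out[..., 0] += attn_map * noise
    # Green and blue channels: perturbation weighted by the GradCAM map,
    # targeting regions most responsible for the classification.
    out[..., 1] += gradcam_map * noise
    out[..., 2] += gradcam_map * noise
    return np.clip(out, 0.0, 1.0)
```

Because both maps lie in [0, 1], each pixel moves by at most `eps`, which is consistent with the structure-preserving, high-PSNR behavior the abstract reports.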