🤖 AI Summary
To address the lack of effective adversarial attacks against pixel-domain diffusion models (PDMs), this paper proposes AtkPDM, the first systematic framework for disrupting PDM-based image editing methods such as SDEdit. Methodologically, AtkPDM combines a feature-representation attack loss that exploits vulnerabilities in the denoising U-Net's intermediate features with a latent optimization strategy that preserves the naturalness of the adversarial image. Its contributions are threefold: (1) it establishes the first adversarial attack paradigm designed specifically for PDM-based editing; (2) it achieves high attack success rates (>92%) while maintaining image fidelity (PSNR > 28 dB) and natural visual quality; and (3) it remains robust against common defenses, including JPEG compression and denoising, and extends to latent diffusion models (LDMs), where it performs comparably to existing approaches.
📝 Abstract
Diffusion Models have emerged as powerful generative models for high-quality image synthesis, with many subsequent image editing techniques based on them. However, the ease of text-based image editing introduces significant risks, such as malicious editing for scams or intellectual property infringement. Previous works have attempted to safeguard images from diffusion-based editing by adding imperceptible perturbations. These methods are costly and specifically target prevalent Latent Diffusion Models (LDMs), while Pixel-domain Diffusion Models (PDMs) remain largely unexplored and robust against such attacks. Our work addresses this gap by proposing a novel attack framework, AtkPDM. AtkPDM is mainly composed of a feature representation attacking loss that exploits vulnerabilities in denoising UNets and a latent optimization strategy to enhance the naturalness of adversarial images. Extensive experiments demonstrate the effectiveness of our approach in attacking dominant PDM-based editing methods (e.g., SDEdit) while maintaining reasonable fidelity and robustness against common defense methods. Additionally, our framework is extensible to LDMs, achieving comparable performance to existing approaches.