A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse

📅 2024-08-20
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the growing risks of copyright infringement and data theft in latent diffusion model (LDM)-based image editing. We propose a lightweight gray-box attack that requires access only to the VAE encoder parameters of the target LDM. By applying gradient-guided perturbations in the latent space, our method actively induces posterior collapse—thereby severely degrading semantic consistency in generated images. Crucially, we are the first to repurpose the well-known posterior collapse phenomenon from VAE training as a controllable adversarial mechanism, eliminating strong architectural assumptions about the target model and enabling high cross-model transferability with minimal intrusiveness. Extensive experiments demonstrate that our approach achieves superior semantic collapse across diverse LDM variants compared to state-of-the-art methods, while reducing inference latency and GPU memory consumption by over 30%.

📝 Abstract
Recent advancements in generative AI, particularly Latent Diffusion Models (LDMs), have revolutionized image synthesis and manipulation. However, these generative techniques raise concerns about data misappropriation and intellectual property infringement. Adversarial attacks on machine learning models have been extensively studied, and a well-established body of research has repurposed these techniques as a benign, protective measure against the misuse of generative AI. Current approaches to safeguarding images from manipulation by LDMs are limited by their reliance on model-specific knowledge and their inability to significantly degrade the semantic quality of generated images. In response to these shortcomings, we propose the Posterior Collapse Attack (PCA), based on the observation that VAEs suffer from posterior collapse during training. Our method minimizes dependence on white-box information about target models, eliminating the implicit reliance on model-specific knowledge. By accessing only a small fraction of the LDM's parameters, specifically the VAE encoder, our method causes a substantial semantic collapse in generation quality, particularly in perceptual consistency, and demonstrates strong transferability across various model architectures. Experimental results show that PCA achieves superior perturbation effects on LDM image generation with lower runtime and VRAM consumption. Our method outperforms existing techniques, offering a more robust and generalizable solution that helps alleviate the socio-technical challenges posed by the rapidly evolving landscape of generative AI.
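The core mechanism described above — perturbing an input so the VAE encoder's posterior is driven toward the prior, destroying the semantic information the latent carries — can be sketched numerically. The following is a minimal toy illustration, not the paper's implementation: it assumes a linear encoder with a fixed posterior variance, so the KL term against a standard-normal prior reduces to 0.5·||mu||², and it minimizes that term with projected gradient descent under an L∞ budget. All names (`posterior_collapse_attack`, `W`, `eps`, `alpha`) are illustrative assumptions.

```python
import numpy as np

def posterior_collapse_attack(x, W, eps=0.1, alpha=0.02, steps=100):
    """Toy PGD sketch: push a linear encoder's posterior mean
    mu = W @ (x + delta) toward the prior mean 0, i.e. minimize
    0.5 * ||mu||^2 (the KL-to-prior term with variance held fixed),
    subject to ||delta||_inf <= eps."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        mu = W @ (x + delta)
        grad = W.T @ mu  # gradient of 0.5 * ||W (x + delta)||^2 w.r.t. delta
        delta = np.clip(delta - alpha * grad, -eps, eps)  # L_inf projection
    return delta

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)) / np.sqrt(8)  # stand-in "encoder" weights
x = rng.normal(size=8)                    # stand-in clean image

delta = posterior_collapse_attack(x, W)
kl_before = 0.5 * np.sum((W @ x) ** 2)
kl_after = 0.5 * np.sum((W @ (x + delta)) ** 2)
```

In the real setting the encoder is a deep network and the gradient would come from automatic differentiation through the actual KL objective, but the structure is the same: a small, budget-constrained perturbation that collapses the encoder's posterior toward the prior, which is what starves the downstream diffusion process of usable semantics.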
Problem

Research questions and friction points this paper is trying to address.

Addresses misuse of Latent Diffusion Models
Reduces reliance on model-specific knowledge
Degrades semantic quality in generated images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Posterior Collapse Attack (PCA)
Minimizes white-box information reliance
Causes semantic collapse in generation