Latent Diffusion Unlearning: Protecting Against Unauthorized Personalization Through Trajectory Shifted Perturbations

📅 2025-10-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses unauthorized personalization and data inversion attacks against text-to-image diffusion models. We propose a latent-space trajectory perturbation method that introduces an “unlearning” mechanism into latent diffusion models (LDMs) for the first time. By alternating between denoising and inversion in the latent space, our approach shifts the sampling trajectory during generation, substantially enhancing image unlearnability. Crucially, it preserves high visual fidelity, improving PSNR, SSIM, and FID by 8%–10%, while defending against model extraction and inversion-based personalization attacks, with an average robustness gain of about 10%. Extensive experiments across four benchmark datasets validate both the efficacy and generalizability of the proposed method.

📝 Abstract

Text-to-image diffusion models have demonstrated remarkable effectiveness in rapid and high-fidelity personalization, even when provided with only a few user images. However, the effectiveness of personalization techniques has led to concerns regarding data privacy, intellectual property protection, and unauthorized usage. To mitigate such unauthorized usage and model replication, the idea of generating "unlearnable" training samples using image poisoning techniques has emerged. Existing methods have limited imperceptibility because they operate in pixel space, which results in images with noise and artifacts. In this work, we propose a novel model-based perturbation strategy that operates within the latent space of diffusion models. Our method alternates between denoising and inversion while modifying the starting point of the denoising trajectory. This trajectory-shifted sampling ensures that the perturbed images maintain high visual fidelity to the original inputs while being resistant to inversion and personalization by downstream generative models. This approach integrates unlearnability into the framework of Latent Diffusion Models (LDMs), enabling a practical and imperceptible defense against unauthorized model adaptation. We validate our approach on four benchmark datasets to demonstrate robustness against state-of-the-art inversion attacks. Results demonstrate that our method achieves significant improvements in imperceptibility ($\sim 8\%$ to $10\%$ on perceptual metrics including PSNR, SSIM, and FID) and robustness ($\sim 10\%$ on average across five adversarial settings), highlighting its effectiveness in safeguarding sensitive data.
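As a point of reference for the imperceptibility metrics named in the abstract, the following is a minimal sketch of PSNR computed between an original image and its perturbed version. It is not the paper's evaluation code: the function name `psnr`, the flat-list pixel representation, and the 8-bit default `max_value` are assumptions made for illustration.

```python
import math


def psnr(original, perturbed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Illustrative helper (not from the paper): pixels are given as flat
    sequences of numbers, and max_value is the largest possible pixel value.
    """
    if len(original) != len(perturbed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and have the same length")
    # Mean squared error between the two images.
    mse = sum((a - b) ** 2 for a, b in zip(original, perturbed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no measurable distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [100, 100, 100, 100]
    protected = [101, 99, 101, 99]  # every pixel changed by one intensity level
    print(round(psnr(clean, protected), 2))
```

A higher PSNR means the protected image is closer to the original, which is the sense in which the abstract reports improved imperceptibility.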

Problem

Research questions and friction points this paper is trying to address.

Protecting data privacy against unauthorized personalization in diffusion models

Creating imperceptible perturbations in latent space to prevent model replication

Ensuring visual fidelity while resisting inversion attacks on sensitive data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent space perturbation strategy in diffusion models

Trajectory-shifted sampling for high visual fidelity

Integrates unlearnability into Latent Diffusion Models framework
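The trajectory-shifted sampling idea listed above can be pictured with a toy loop that alternates deterministic DDIM-style inversion and denoising on a latent vector, nudging the starting point of each denoising pass. This is only a sketch under strong assumptions and not the authors' algorithm: `toy_eps` is a hypothetical linear stand-in for a trained noise predictor, the cosine schedule in `alpha_bar` is chosen for convenience, and the fixed `shift` vector is a placeholder, since this summary does not say how the paper chooses the modified starting point.

```python
import math


def alpha_bar(t, num_steps):
    """Assumed cosine-style cumulative signal level: 1 at t=0, small at t=num_steps."""
    return math.cos(0.5 * math.pi * t / (num_steps + 1)) ** 2


def toy_eps(x, a_t):
    """Hypothetical noise predictor standing in for a trained denoiser."""
    return [math.sqrt(1.0 - a_t) * v for v in x]


def ddim_move(x, a_from, a_to):
    """One deterministic DDIM update from signal level a_from to a_to."""
    eps = toy_eps(x, a_from)
    # Predicted clean latent implied by the current noise estimate.
    x0 = [(v - math.sqrt(1.0 - a_from) * e) / math.sqrt(a_from)
          for v, e in zip(x, eps)]
    return [math.sqrt(a_to) * p + math.sqrt(1.0 - a_to) * e
            for p, e in zip(x0, eps)]


def invert(z, num_steps):
    """Walk a clean latent up the trajectory toward the noisy end (t = num_steps)."""
    for t in range(num_steps):
        z = ddim_move(z, alpha_bar(t, num_steps), alpha_bar(t + 1, num_steps))
    return z


def denoise(z, num_steps):
    """Walk a noisy latent back down the trajectory to t = 0."""
    for t in range(num_steps, 0, -1):
        z = ddim_move(z, alpha_bar(t, num_steps), alpha_bar(t - 1, num_steps))
    return z


def trajectory_shifted_perturb(z0, shift, num_steps=20, rounds=2):
    """Alternate inversion and denoising, shifting each denoising start point."""
    z = list(z0)
    for _ in range(rounds):
        z_start = invert(z, num_steps)
        # Placeholder shift of the starting point (hypothetical fixed vector).
        z_start = [v + s for v, s in zip(z_start, shift)]
        z = denoise(z_start, num_steps)
    return z


if __name__ == "__main__":
    latent = [0.5, -1.0, 0.25]
    print(trajectory_shifted_perturb(latent, [0.1, 0.0, -0.1]))
```

With this toy predictor the inversion and denoising passes are not exact inverses, so each round slightly shrinks the latent even with a zero shift. In a real LDM the latent would come from the image encoder and the result would be decoded back to pixels, steps omitted here.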
