🤖 AI Summary
Diffusion models such as Stable Diffusion pose significant security risks, including deepfakes and copyright infringement, owing to their strong customization capabilities; existing protective perturbations are easily removed by denoising-based purification, leaving images exposed to misuse again. To address this, the authors formalize the anti-purification task and propose AntiPure, a diagnostic protective perturbation built on two synergistic guidance mechanisms: Patch-wise Frequency Guidance, which weakens the purification model's control over high-frequency components of the purified image, and Erroneous Timestep Guidance, which disrupts the model's denoising strategy across timesteps. Together, these embed imperceptible perturbations that persist through representative purification settings. Experiments show that AntiPure achieves minimal perceptual discrepancy while maximizing post-customization distortion, outperforming other protective perturbation methods within the purification-customization workflow and exposing security vulnerabilities in mainstream purification strategies.
📝 Abstract
Diffusion models like Stable Diffusion have become prominent in visual synthesis tasks due to their powerful customization capabilities, which also introduce significant security risks, including deepfakes and copyright infringement. In response, a class of methods known as protective perturbation has emerged, mitigating image misuse by injecting imperceptible adversarial noise. However, purification can remove protective perturbations, thereby exposing images again to the risk of malicious forgery. In this work, we formalize the anti-purification task, highlighting challenges that hinder existing approaches, and propose a simple diagnostic protective perturbation named AntiPure. AntiPure exposes vulnerabilities of purification within the "purification-customization" workflow, owing to two guidance mechanisms: 1) Patch-wise Frequency Guidance, which reduces the model's influence over high-frequency components in the purified image, and 2) Erroneous Timestep Guidance, which disrupts the model's denoising strategy across different timesteps. With additional guidance, AntiPure embeds imperceptible perturbations that persist under representative purification settings, achieving effective post-customization distortion. Experiments show that, as a stress test for purification, AntiPure achieves minimal perceptual discrepancy and maximal distortion, outperforming other protective perturbation methods within the purification-customization workflow.
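To make the protective-perturbation setup concrete, the toy sketch below illustrates the general shape of such an attack: a bounded perturbation is optimized so that, even after purification, the image retains elevated high-frequency energy measured patch-wise in the Fourier domain. Everything here is a hypothetical stand-in, not AntiPure's actual method: `toy_purifier` is a simple box blur substituting for diffusion purification, `highfreq_energy` is an assumed proxy objective loosely inspired by Patch-wise Frequency Guidance, and gradients are estimated with SPSA instead of backpropagating through a diffusion model. Erroneous Timestep Guidance is omitted, since it requires a real timestep-conditioned denoiser.

```python
import numpy as np

rng = np.random.default_rng(0)

def highfreq_energy(img, patch=8, keep=2):
    # Patch-wise high-frequency energy: mean magnitude of FFT
    # coefficients outside the central low-frequency block of each
    # non-overlapping patch (assumed proxy, not the paper's objective).
    vals = []
    for i in range(0, img.shape[0], patch):
        for j in range(0, img.shape[1], patch):
            f = np.fft.fftshift(np.fft.fft2(img[i:i + patch, j:j + patch]))
            c = patch // 2
            mask = np.ones((patch, patch), bool)
            mask[c - keep:c + keep, c - keep:c + keep] = False  # drop low freqs
            vals.append(np.abs(f[mask]).mean())
    return float(np.mean(vals))

def toy_purifier(img):
    # Stand-in for diffusion-based purification: a 3x3 box blur.
    pad = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = pad[i:i + 3, j:j + 3].mean()
    return out

def attack_loss(x):
    # Maximize high-frequency residue that survives purification.
    return highfreq_energy(toy_purifier(x))

def pgd_attack(x0, eps=8 / 255, step=2 / 255, iters=10, c=1e-3):
    # Sign-gradient ascent with SPSA gradient estimates, projected to
    # an L-infinity ball of radius eps (toy scale only).
    delta = np.zeros_like(x0)
    for _ in range(iters):
        u = rng.choice([-1.0, 1.0], size=x0.shape)
        lp = attack_loss(np.clip(x0 + delta + c * u, 0, 1))
        lm = attack_loss(np.clip(x0 + delta - c * u, 0, 1))
        g = (lp - lm) / (2 * c) * u
        delta = np.clip(delta + step * np.sign(g), -eps, eps)
    return np.clip(x0 + delta, 0, 1)
```

In the real setting, the blur would be replaced by a diffusion purifier and the SPSA estimate by exact gradients through it; the sketch only shows how a frequency-domain objective and an imperceptibility budget combine in one optimization loop.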