🤖 AI Summary
Fine-tuning and zero-shot customization of diffusion models sharply increase the risk of facial privacy leakage, yet existing defenses primarily target fine-tuning attacks and are largely ineffective against zero-shot threats. To close this gap, the paper proposes DLADiff, a dual-layer defense framework and the first to jointly mitigate both fine-tuning-based and zero-shot customization attacks. The first layer combines the Dual-Surrogate Models (DSUR) mechanism with Alternating Dynamic Fine-Tuning (ADFT), which integrates adversarial training with prior knowledge derived from pre-fine-tuned models; the second layer, though simple in design, effectively blocks zero-shot generation from a protected image. Extensive experiments show that DLADiff significantly outperforms state-of-the-art defenses against diffusion-model fine-tuning while achieving unprecedented protection against zero-shot generation.
📝 Abstract
With the rapid advancement of diffusion models, a variety of fine-tuning methods have been developed that enable high-fidelity image generation closely matching the target content from only 3 to 5 training images. More recently, zero-shot generation methods have emerged that can produce highly realistic outputs from a single reference image without altering model weights. These technological advances, however, also pose significant risks to facial privacy: malicious actors can exploit diffusion-model customization with just a few images of a person, or even one, to create synthetic identities nearly indistinguishable from the original. Although research has begun to address defenses against diffusion-model customization, most existing methods target fine-tuning approaches and neglect defenses against zero-shot generation. To address this issue, this paper proposes Dual-Layer Anti-Diffusion (DLADiff) to defend against both fine-tuning and zero-shot methods. DLADiff contains a dual-layer protective mechanism. The first layer provides effective protection against unauthorized fine-tuning by leveraging the proposed Dual-Surrogate Models (DSUR) mechanism and Alternating Dynamic Fine-Tuning (ADFT), which integrates adversarial training with prior knowledge derived from pre-fine-tuned models. The second layer, though simple in design, is highly effective at preventing image generation through zero-shot methods. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in defending against fine-tuning of diffusion models and achieves unprecedented performance in protecting against zero-shot generation.
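The abstract does not give implementation details, but the first layer's surrogate-based adversarial protection follows the general pattern of protective-perturbation methods: add imperceptible noise to a face image that maximizes a surrogate model's feature error under an L-infinity budget, so downstream customization learns a corrupted identity. The sketch below is a hypothetical toy illustration of that pattern only, not the paper's method: the linear `extract` stand-in replaces the diffusion surrogates, and all names and parameters (`pgd_protect`, `eps`, `alpha`) are assumptions for illustration.

```python
import random

def extract(w, x):
    """Toy stand-in for a surrogate feature extractor: a fixed linear map
    given as a list of weight rows. Real defenses use diffusion surrogates."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def pgd_protect(x, w, eps=0.1, alpha=0.02, steps=50, seed=0):
    """PGD-style protective noise: maximize ||f(x + delta) - f(x)||^2
    subject to |delta_i| <= eps (the perceptibility budget)."""
    rng = random.Random(seed)
    f_clean = extract(w, x)
    # random start inside the budget (a zero start has zero gradient here)
    delta = [rng.uniform(-eps, eps) for _ in x]
    for _ in range(steps):
        f_adv = extract(w, [xi + di for xi, di in zip(x, delta)])
        resid = [fa - fc for fa, fc in zip(f_adv, f_clean)]
        # gradient of the squared feature distance w.r.t. delta: 2 * W^T resid
        grad = [2 * sum(w[j][i] * resid[j] for j in range(len(w)))
                for i in range(len(x))]
        # signed ascent step, then project back into the L-infinity ball
        delta = [max(-eps, min(eps, di + alpha * (1 if g > 0 else -1 if g < 0 else 0)))
                 for di, g in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]
```

In a real defense the extractor would be a (pre-fine-tuned) diffusion surrogate and the loss would target the denoising or identity objective, but the structure — ascend the surrogate loss, project to the noise budget — is the same.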