🤖 AI Summary
Fine-tuning and zero-shot customization of diffusion models sharply increase the risk of facial privacy leakage, yet existing defenses primarily target fine-tuning attacks and are largely ineffective against zero-shot threats. To close this gap, the paper proposes DLADiff, a dual-layer defense framework and the first to jointly mitigate both fine-tuning-based and zero-shot customization attacks. The first layer combines the Dual-Surrogate Models (DSUR) mechanism with Alternating Dynamic Fine-Tuning (ADFT), which integrates adversarial training with prior knowledge derived from pre-fine-tuned models; the second layer, though simple in design, effectively blocks zero-shot generation from a protected image. Extensive experiments show that DLADiff significantly outperforms state-of-the-art defenses against diffusion-model fine-tuning while achieving unprecedented protection against zero-shot generation.
📝 Abstract
With the rapid advancement of diffusion models, a variety of fine-tuning methods have been developed that enable high-fidelity image generation closely matching the target content from only 3 to 5 training images. More recently, zero-shot generation methods have emerged that can produce highly realistic outputs from a single reference image without altering model weights. These technological advances, however, also pose significant risks to facial privacy: malicious actors can exploit diffusion-model customization with just a few images of a person, or even one, to create synthetic identities nearly indistinguishable from the original. Although research has begun to address defenses against diffusion-model customization, most existing methods target fine-tuning approaches and neglect defenses against zero-shot generation. To address this issue, this paper proposes Dual-Layer Anti-Diffusion (DLADiff) to defend against both fine-tuning and zero-shot methods. DLADiff contains a dual-layer protective mechanism. The first layer provides effective protection against unauthorized fine-tuning by leveraging the proposed Dual-Surrogate Models (DSUR) mechanism and Alternating Dynamic Fine-Tuning (ADFT), which integrates adversarial training with prior knowledge derived from pre-fine-tuned models. The second layer, though simple in design, is highly effective at preventing image generation through zero-shot methods. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in defending against fine-tuning of diffusion models and achieves unprecedented performance in protecting against zero-shot generation.
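The abstract does not give implementation details, but the first layer's surrogate-based adversarial protection follows the general pattern of protective-perturbation methods: add imperceptible noise to a face image that maximizes a surrogate model's feature error under an L-infinity budget, so downstream customization learns a corrupted identity. The sketch below is a hypothetical toy illustration of that pattern only, not the paper's method: the linear `extract` stand-in replaces the diffusion surrogates, and all names and parameters (`pgd_protect`, `eps`, `alpha`) are assumptions for illustration.

```python
import random

def extract(w, x):
    """Toy stand-in for a surrogate feature extractor: a fixed linear map
    given as a list of weight rows. Real defenses use diffusion surrogates."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def pgd_protect(x, w, eps=0.1, alpha=0.02, steps=50, seed=0):
    """PGD-style protective noise: maximize ||f(x + delta) - f(x)||^2
    subject to |delta_i| <= eps (the perceptibility budget)."""
    rng = random.Random(seed)
    f_clean = extract(w, x)
    # random start inside the budget (a zero start has zero gradient here)
    delta = [rng.uniform(-eps, eps) for _ in x]
    for _ in range(steps):
        f_adv = extract(w, [xi + di for xi, di in zip(x, delta)])
        resid = [fa - fc for fa, fc in zip(f_adv, f_clean)]
        # gradient of the squared feature distance w.r.t. delta: 2 * W^T resid
        grad = [2 * sum(w[j][i] * resid[j] for j in range(len(w)))
                for i in range(len(x))]
        # signed ascent step, then project back into the L-infinity ball
        delta = [max(-eps, min(eps, di + alpha * (1 if g > 0 else -1 if g < 0 else 0)))
                 for di, g in zip(delta, grad)]
    return [xi + di for xi, di in zip(x, delta)]
```

In a real defense the extractor would be a (pre-fine-tuned) diffusion surrogate and the loss would target the denoising or identity objective, but the structure — ascend the surrogate loss, project to the noise budget — is the same.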