Combinational Backdoor Attack against Customized Text-to-Image Models

📅 2024-11-19

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

career value

191K/year

🤖 AI Summary

Customized text-to-image (T2I) models often integrate third-party pre-trained components—such as text encoders and conditional diffusion models—introducing combinatorial backdoor risks wherein malicious behavior manifests only when specific contaminated modules co-occur. Method: This paper proposes the first module-coordinated backdoor attack: the backdoor activates exclusively when a compromised text encoder and a poisoned diffusion model are jointly deployed, enhancing stealth and attack selectivity. Built upon the Stable Diffusion architecture, it introduces trigger words and image perturbations, integrating module-level parameter poisoning with conditional alignment constraints. Contribution/Results: Evaluated across multiple triggers and target images, the attack achieves >92% success rate while preserving normal generation fidelity—PSNR degradation remains <1.5%. This work is the first to identify and empirically demonstrate a novel supply-chain-level security threat in T2I model assembly pipelines.

Technology Category

Application Category

📝 Abstract

Recently, Text-to-Image (T2I) synthesis technology has made tremendous strides. Numerous representative T2I models have emerged and achieved promising application outcomes, such as DALL-E, Stable Diffusion, Imagen, etc. In practice, it has become increasingly popular for model developers to selectively adopt various pre-trained text encoders and conditional diffusion models from third-party platforms, integrating them to build customized (personalized) T2I models. However, such an adoption approach is vulnerable to backdoor attacks. In this work, we propose a Combinational Backdoor Attack against Customized T2I models (CBACT2I) targeting this application scenario. Different from previous backdoor attacks against T2I models, CBACT2I embeds the backdoor into the text encoder and the conditional diffusion model separately. The customized T2I model exhibits backdoor behaviors only when the backdoor text encoder is used in combination with the backdoor conditional diffusion model. These properties make CBACT2I more stealthy and flexible than prior backdoor attacks against T2I models. Extensive experiments demonstrate the effectiveness of CBACT2I with different backdoor triggers and different backdoor targets on the open-sourced Stable Diffusion model. This work reveals the backdoor vulnerabilities of customized T2I models and urges countermeasures to mitigate backdoor threats in this scenario.

Problem

Research questions and friction points this paper is trying to address.

Proposes stealthy backdoor attack on customized text-to-image models

Embeds separate backdoors in text encoder and diffusion model components

Ensures attack triggers only when both compromised components are combined

Innovation

Methods, ideas, or system contributions that make the work stand out.

Embeds backdoors separately in text encoder and diffusion model

Activates only when both compromised components are combined

Enhances stealth and control compared to prior attacks

🔎 Similar Papers

No similar papers found.