BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model

📅 2025-02-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: The diffusion model (DM) community lacks a unified benchmark for evaluating backdoor attacks and defenses, which hinders progress in trustworthy generative AI. Method: We introduce the first comprehensive backdoor learning benchmark tailored for DMs, integrating nine state-of-the-art attacks, four defense methods, and two visualization analysis tools. We propose a unified backdoor taxonomy covering three attack types and five target types. We also design a unified evaluation method based on GPT-4o for automated quantitative assessment. Contribution/Results: We publicly release the benchmark code. A comprehensive evaluation across the included attacks and defenses yields several important conclusions about backdoors in DMs. This work provides the first standardized benchmark for evaluating backdoor attacks and defenses in diffusion models.

📝 Abstract

Backdoor learning is a critical research topic for understanding the vulnerabilities of deep neural networks. While it has been extensively studied in discriminative models over the past few years, backdoor learning in diffusion models (DMs) has recently attracted increasing attention, becoming a new research hotspot. Although many different backdoor attack and defense methods have been proposed for DMs, a comprehensive benchmark for backdoor learning in DMs is still lacking. This absence makes it difficult to conduct fair comparisons and thoroughly evaluate existing approaches, thus hindering future research progress. To address this issue, we propose BackdoorDM, the first comprehensive benchmark designed for backdoor learning in DMs. It comprises nine state-of-the-art (SOTA) attack methods, four SOTA defense strategies, and two helpful visualization analysis tools. We first systematically classify and formulate the existing literature in a unified framework, focusing on three different backdoor attack types and five backdoor target types, whereas backdoor targets in discriminative models are restricted to a single type. Then, we systematically summarize the evaluation metrics for each type and propose a unified backdoor evaluation method based on GPT-4o. Finally, we conduct a comprehensive evaluation and highlight several important conclusions. We believe that BackdoorDM will help overcome current barriers and contribute to building a trustworthy DM community. The code is available at https://github.com/linweiii/BackdoorDM.

Problem

Research questions and friction points this paper is trying to address.

Develops benchmark for backdoor learning

Evaluates attack and defense methods

Enhances trust in diffusion models

Innovation

Methods, ideas, or system contributions that make the work stand out.

BackdoorDM benchmark

GPT-4o evaluation method

Nine SOTA attack methods

🔎 Similar Papers

UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models