🤖 AI Summary
Existing unlearning methods for text-to-image diffusion models often fail to remove target concepts precisely and inadvertently degrade unrelated generative capabilities, falling short of requirements such as copyright compliance. This work proposes SurgUn, the first approach to bring retroactive interference theory into diffusion model unlearning. Through targeted weight-space updates, SurgUn induces competition between the target concept and newly learned content along shared representational pathways, enabling precise forgetting. The method is compatible with mainstream architectures, including the U-Net and the Diffusion Transformer, and consistently removes specific visual concepts across Stable Diffusion v1.5, SDXL, and SANA while largely preserving general generation capability, validating its generality and scalability.
📝 Abstract
Unlearning in text-to-image diffusion models often leads to uneven concept removal and unintended forgetting of unrelated capabilities. This complicates tasks such as copyright compliance, protected-data mitigation, artist opt-outs, and policy-driven content updates. As models grow larger and adopt more diverse architectures, achieving precise and selective unlearning while preserving generative quality becomes increasingly challenging. We introduce SurgUn (pronounced "Surgeon"), a surgical unlearning method that applies targeted weight-space updates to remove specific visual concepts from text-conditioned diffusion models. Our approach is motivated by retroactive interference theory, which holds that newly acquired memories can overwrite, suppress, or impede access to prior ones by competing for shared representational pathways. We adapt this principle to diffusion models by inducing retroactive concept interference, destabilizing only the target concept while preserving unrelated capabilities through a novel training paradigm. SurgUn achieves high-precision unlearning across diverse settings: it performs strongly on compact U-Net-based models such as Stable Diffusion v1.5, scales effectively to the larger U-Net architecture SDXL, and extends to SANA, a Diffusion Transformer architecture that remains underexplored in unlearning research.
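The abstract does not give SurgUn's actual training objective, but the core tension it describes, new content overwriting a target concept along shared pathways while unrelated behavior is anchored, can be illustrated with a toy sketch. Everything below is a hypothetical simplification: a linear map stands in for the diffusion model, `e_anchor` is an invented surrogate concept, and the loss weighting is arbitrary. It is meant only to convey the interference-versus-retention trade-off, not to reproduce the paper's method.

```python
import numpy as np

# Toy sketch of "retroactive concept interference" (hypothetical simplification,
# NOT the paper's objective): retrain the target concept's pathway to reproduce
# an anchor concept's output, while a retention term pins unrelated concepts'
# outputs to those of the frozen original model.
rng = np.random.default_rng(0)
d = 16
W0 = rng.normal(size=(d, d))      # frozen copy of the original weights
W = W0.copy()                     # weights being surgically edited

e_target = rng.normal(size=d)     # embedding of the concept to forget
e_anchor = rng.normal(size=d)     # surrogate concept that "overwrites" it
e_keep = rng.normal(size=(8, d))  # unrelated concepts whose behavior must survive

y_anchor = W0 @ e_anchor          # the anchor concept's original output
y_keep = e_keep @ W0.T            # original outputs for the unrelated concepts

lam, lr, steps = 1.0, 0.01, 5000
for _ in range(steps):
    # Interference term: the forgotten pathway should yield the anchor's output.
    r_forget = W @ e_target - y_anchor
    grad = np.outer(r_forget, e_target)
    # Retention term: unrelated pathways are pulled back to their original outputs.
    r_keep = e_keep @ W.T - y_keep
    grad += lam * r_keep.T @ e_keep / len(e_keep)
    W -= lr * grad

# After editing, the target pathway behaves like the anchor while the
# unrelated pathways stay close to the original model (both residuals small).
print(np.linalg.norm(W @ e_target - y_anchor))
print(np.linalg.norm(e_keep @ W.T - y_keep))
```

In a real diffusion model, the outputs would be predicted noise under concept-conditioned prompts rather than linear-map products, and the retention set would be a collection of unrelated prompts; the sketch only shows how one objective can destabilize a single concept while anchoring everything else.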