Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

πŸ“… 2024-12-09
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 1
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the need for efficient erasure of unsafe concepts (e.g., copyrighted or sensitive content) in diffusion models, this paper proposes AdaVD, a training-free method that performs adaptive orthogonal complement projection in the value space of the UNet's cross-attention layers to achieve precise, fast, and low-cost concept removal. Key contributions include: (i) an orthogonal complement-based erasure mechanism operating directly in the attention value space; (ii) an adaptive shift factor that balances erasure strength against prior fidelity; and (iii) support for both single- and multi-concept removal without fine-tuning. Experiments demonstrate that AdaVD achieves state-of-the-art (SOTA) or near-SOTA erasure efficacy while improving prior preservation by 2-10x over the second-best method. Moreover, AdaVD is compatible with mainstream diffusion architectures and downstream generative tasks.

πŸ“ Abstract
Recent success of text-to-image (T2I) generation and its increasing practical applications, enabled by diffusion models, require urgent consideration of erasing unwanted concepts, e.g., copyrighted, offensive, and unsafe ones, from the pre-trained models in a precise, timely, and low-cost manner. The twofold demand of concept erasure includes not only a precise removal of the target concept (i.e., erasure efficacy) but also a minimal change to non-target content (i.e., prior preservation) during generation. Existing methods struggle to maintain an effective balance between erasure efficacy and prior preservation, and they can be computationally costly. To improve, we propose a precise, fast, and low-cost concept erasure method, called Adaptive Value Decomposer (AdaVD), which is training-free. Our method is grounded in a classical linear algebraic operation, computing the orthogonal complement, implemented in the value space of each cross-attention layer within the UNet of diffusion models. We design a shift factor to adaptively navigate the erasure strength, enhancing prior preservation without sacrificing erasure efficacy. Extensive comparative experiments with both training-based and training-free state-of-the-art methods demonstrate that the proposed AdaVD excels in both single and multiple concept erasure, showing a 2 to 10 times improvement in prior preservation over the second best, while achieving the best or near-best erasure efficacy. AdaVD supports a range of diffusion models and downstream image generation tasks, with code available at: https://github.com/WYuan1001/AdaVD.
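To make the core idea concrete, here is a minimal NumPy sketch of orthogonal complement projection with an adaptive shift factor. This is a hypothetical illustration of the principle only, not the authors' implementation: the function name `erase_concept`, the single-direction concept representation, and the way `shift` blends the projection back are all assumptions for the sake of the example.

```python
import numpy as np

def erase_concept(values, target, shift=0.0):
    """Illustrative sketch: project cross-attention value vectors onto the
    orthogonal complement of a target-concept direction.

    values: (n, d) array of value vectors from a cross-attention layer
    target: (d,) vector representing the concept to erase
    shift:  0.0 = full erasure, 1.0 = no erasure; intermediate values
            trade erasure strength against prior preservation
    """
    u = target / np.linalg.norm(target)       # unit concept direction
    coeff = values @ u                         # component along the concept
    component = np.outer(coeff, u)             # concept-aligned part of values
    # Orthogonal complement projection, with a fraction restored by `shift`
    return values - (1.0 - shift) * component

rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))
concept = rng.standard_normal(8)

erased = erase_concept(tokens, concept)
# With shift=0, the result has no component along the concept direction
residual = erased @ (concept / np.linalg.norm(concept))
assert np.allclose(residual, 0.0, atol=1e-10)
```

Setting `shift=1.0` returns the values unchanged, so the single scalar interpolates continuously between full concept removal and full prior preservation, which mirrors the trade-off the paper's shift factor is designed to navigate.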
Problem

Research questions and friction points this paper is trying to address.

Erasing unwanted concepts from pre-trained models precisely
Balancing erasure efficacy and minimal non-target content change
Achieving fast, low-cost concept erasure without training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free Adaptive Value Decomposer (AdaVD)
Orthogonal complement in value space
Adaptive shift factor for erasure strength