Closed-Form Concept Erasure via Double Projections

📅 2026-04-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the challenge of removing specific concepts from generative models efficiently and safely, without affecting unrelated content. It proposes a training-free linear transformation framework that erases a concept in two sequential closed-form steps: first computing a proxy projection of the target concept, then applying a constrained transformation within the left null space of known concept directions. The procedure is deterministic, geometrically interpretable, and non-iterative, and takes only a few seconds to apply to Stable Diffusion variants and FLUX. It matches or exceeds state-of-the-art erasure performance while preserving non-target concepts more faithfully.

📝 Abstract
While modern generative models such as diffusion-based architectures have enabled impressive creative capabilities, they also raise important safety and ethical risks. These concerns have led to growing interest in concept erasure, the process of removing unwanted concepts from model representations. Existing approaches often achieve strong erasure performance but rely on iterative optimization and may inadvertently distort unrelated concepts. In this work, we present a simple yet principled alternative: a linear transformation framework that achieves concept erasure analytically, without any training. Our method adapts a pretrained model through two sequential, closed-form steps: first, computing a proxy projection of the target concept, and second, applying a constrained transformation within the left null space of known concept directions. This design yields a deterministic and geometrically interpretable procedure for safe, efficient, and theory-grounded concept removal. Across a wide range of experiments, including object and style erasure on multiple Stable Diffusion variants and the flow-matching model (FLUX), our approach matches or surpasses the performance of state-of-the-art methods while preserving non-target concepts more faithfully. Requiring only a few seconds to apply, it offers a lightweight and drop-in tool for controlled model editing, advancing the goal of safer and more responsible generative models.
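The two closed-form steps described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's equations, so everything below is an assumption made for illustration: the proxy is taken to be the projection of the target concept onto the span of the preserved concepts, and the update is taken to be the minimal rank-one change whose rows lie in the left null space of the preserved concept matrix. The names `erase_concept`, `orthonormal_basis`, `project_onto_span`, and `matvec` are hypothetical helpers, and `W` stands in for a single linear layer such as a cross-attention projection.

```python
# Illustrative sketch only. The formulas are assumptions chosen to match the
# abstract's description; they are not taken from the paper.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(W, v):
    """Apply a weight matrix (list of rows) to a vector."""
    return [dot(row, v) for row in W]

def orthonormal_basis(vectors, tol=1e-12):
    """Modified Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors that are linearly dependent on earlier ones
            basis.append([wi / norm for wi in w])
    return basis

def project_onto_span(v, basis):
    """Orthogonal projection of `v` onto the span of an orthonormal basis."""
    out = [0.0] * len(v)
    for b in basis:
        coeff = dot(v, b)
        out = [oi + coeff * bi for oi, bi in zip(out, b)]
    return out

def erase_concept(W, target, preserved):
    """Return (W_new, proxy) with W_new @ target == W @ proxy and
    W_new @ p == W @ p for every preserved concept vector p."""
    basis = orthonormal_basis(preserved)
    # Step 1 (assumed form): proxy = projection of the target onto the
    # span of the preserved concepts.
    proxy = project_onto_span(target, basis)
    # Component of the target that is orthogonal to every preserved concept.
    residual = [t - p for t, p in zip(target, proxy)]
    norm = dot(residual, residual) ** 0.5
    if norm < 1e-12:
        raise ValueError("target lies in the span of the preserved concepts")
    u = [r / norm for r in residual]
    # Step 2 (assumed form): W_new = W (I - u u^T). The change W_new - W has
    # rows proportional to u, so it vanishes on every preserved concept.
    W_new = []
    for row in W:
        coeff = dot(row, u)
        W_new.append([w - coeff * ui for w, ui in zip(row, u)])
    return W_new, proxy

# Toy example: 2x3 weight matrix, two preserved concepts, one target.
W = [[1.0, 2.0, 3.0], [0.0, 1.0, 4.0]]
preserved = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
target = [1.0, 1.0, 1.0]
W_new, proxy = erase_concept(W, target, preserved)
```

In this toy case the target is mapped to the same output as its proxy, while both preserved concepts produce exactly the outputs they did before the edit. Because the update is a single projection applied to the weights, no optimization loop is needed, which is consistent with the abstract's claim that the edit is deterministic and fast; the actual method may define the proxy and the constraint differently.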
Problem

Research questions and friction points this paper is trying to address.

concept erasure
generative models
safety
ethical risks
model editing
Innovation

Methods, ideas, or system contributions that make the work stand out.

concept erasure
closed-form solution
linear transformation
null space projection
generative model safety
🔎 Similar Papers
2024-05-21 | Neural Information Processing Systems | Citations: 11