SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models

📅 2025-09-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing concept erasure methods for text-to-image diffusion models struggle with narrow concepts such as celebrities or copyrighted characters: they either fail to remove the concept robustly or degrade image quality, largely because narrow concepts lie close to non-target neighboring concepts and their boundaries are ambiguous. To address this, the paper proposes Subspace Mapping (SuMa), which first derives a target subspace representing the concept to be erased and then neutralizes it by mapping it to a reference subspace chosen to minimize the distance between the two, allowing finer-grained suppression of the concept. Experiments across four erasure tasks (subclass, celebrity, artistic style, and instance erasure) show image quality comparable to approaches that prioritize effectiveness, with erasure results on par with methods that prioritize completeness.

📝 Abstract

The rapid growth of text-to-image diffusion models has raised concerns about their potential misuse in generating harmful or unauthorized content. To address these issues, several Concept Erasure methods have been proposed. However, most of them fail to achieve both robustness, i.e., reliably removing the target concept, and effectiveness, i.e., maintaining image quality. While a few recent techniques achieve both goals for NSFW concepts, none can handle narrow concepts such as copyrighted characters or celebrities. Erasing these narrow concepts is critical for addressing copyright and legal concerns, but it is challenging because they lie close to non-target neighboring concepts and therefore require finer-grained manipulation. In this paper, we introduce Subspace Mapping (SuMa), a novel method specifically designed to achieve both robustness and effectiveness in erasing these narrow concepts. SuMa first derives a target subspace representing the concept to be erased and then neutralizes it by mapping it to a reference subspace that minimizes the distance between the two. This mapping ensures the target concept is robustly erased while preserving image quality. We conduct extensive experiments with SuMa across four tasks (subclass erasure, celebrity erasure, artistic style erasure, and instance erasure) and compare the results with current state-of-the-art methods. Our method achieves image quality comparable to approaches focused on effectiveness, while also yielding results that are on par with methods targeting completeness.
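The two steps described in the abstract (derive a target subspace, then map it to a nearby reference subspace) can be illustrated with a small linear-algebra sketch. This is not the paper's procedure: the function names, the use of SVD to obtain a basis, the orthogonal Procrustes alignment as the distance-minimizing step, and the random stand-in features are all assumptions made for illustration only.

```python
import numpy as np


def subspace_basis(features, k):
    """Orthonormal (d, k) basis spanning the top-k singular directions of the rows."""
    # Rows of vt are orthonormal right singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    return vt[:k].T


def align_reference(u_t, u_r):
    """Rotate the reference basis inside its own span to lie closest to u_t.

    Solves the orthogonal Procrustes problem min_Q ||u_r @ Q - u_t||_F
    over orthogonal (k, k) matrices Q (an assumed choice of distance).
    """
    a, _, bt = np.linalg.svd(u_r.T @ u_t)
    return u_r @ (a @ bt)


def map_features(x, u_t, u_r_aligned):
    """Replace each row's target-subspace component with its reference counterpart."""
    coeffs = x @ u_t  # coordinates of each row in the target basis
    return x - coeffs @ u_t.T + coeffs @ u_r_aligned.T


# Hypothetical stand-ins for features of the target and reference concepts.
rng = np.random.default_rng(0)
target_feats = rng.normal(size=(20, 8))
reference_feats = rng.normal(size=(20, 8))

u_t = subspace_basis(target_feats, k=3)
u_r = subspace_basis(reference_feats, k=3)
u_r_aligned = align_reference(u_t, u_r)

mapped = map_features(target_feats, u_t, u_r_aligned)
print(mapped.shape)
```

In this sketch, components orthogonal to the target subspace pass through unchanged, which is one simple way to express the goal of leaving unrelated content intact while redirecting the target concept.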

Problem

Research questions and friction points this paper is trying to address.

Erasing narrow copyrighted concepts from diffusion models

Maintaining robustness and effectiveness in concept removal

Preventing misuse while preserving generated image quality

Innovation

Methods, ideas, or system contributions that make the work stand out.

Subspace mapping for concept erasure

Neutralizes target to reference subspace

Robust removal while preserving quality
