🤖 AI Summary
This work addresses the challenge of efficiently and non-destructively removing specific concepts, such as inappropriate content or copyrighted characters and styles, from text-to-image diffusion models. The authors propose a training-free, inference-time concept erasure method built on a dual-energy guidance mechanism, which they describe as the first of its kind. The approach jointly optimizes two energy functions in the latent space: one repelling the target concept and another preserving the semantics of the original prompt, enabling plug-and-play concept removal. Experimental results show that the method improves concept erasure when applied on top of multiple baselines while maintaining image quality and prompt alignment, including under adversarial attacks.
📝 Abstract
As text-to-image diffusion models grow increasingly prevalent, the ability to remove specific concepts, such as explicit content and copyrighted characters or styles, has become essential for safety and compliance. Existing unlearning approaches often require costly re-training, modify parameters at the cost of degraded fidelity on unrelated concepts, or depend on indirect inference-time adjustments that compromise the effectiveness of concept erasure. Inspired by the success of energy-guided sampling in preserving the conditioning of diffusion models, we introduce Energy-Guided Latent Optimization for Concept Erasure (EGLOCE), a training-free approach that removes unwanted concepts by redirecting the noisy latent during inference. Our method employs a dual-objective framework: a repulsion energy that steers generation away from target concepts via gradient descent in latent space, and a retention energy that preserves semantic alignment with the original prompt. When combined with previous approaches, which either rely on modified model weights that introduce errors or provide only weak inference-time guidance, EGLOCE operates entirely at inference and enhances their erasure performance, enabling plug-and-play integration. Extensive experiments demonstrate that EGLOCE improves concept removal while maintaining image quality and prompt alignment across baselines, even under adversarial attacks. To the best of our knowledge, our work is the first to establish a paradigm for safe and controllable image generation through dual energy-based guidance during sampling.
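To make the dual-objective idea concrete, the following is a minimal toy sketch, not the authors' implementation. The abstract does not specify the form of either energy, so everything here is an assumption made for illustration: the latent is a plain 4-dimensional vector, the repulsion energy is the cosine similarity between the latent and a hypothetical target-concept direction, the retention energy is the negative cosine similarity to a hypothetical prompt direction, and the weights `w_rep`, `w_ret` and step size `lr` are arbitrary. In an actual diffusion pipeline the energies would be computed from model outputs on the noisy latent at each sampling step.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v))

def cosine_grad(u, v):
    """Analytic gradient of cosine(u, v) with respect to u."""
    nu, nv = norm(u), norm(v)
    uv = dot(u, v)
    return [b / (nu * nv) - uv * a / (nu ** 3 * nv) for a, b in zip(u, v)]

def energy(z, target, prompt, w_rep=1.0, w_ret=2.0):
    # Assumed toy energies: repulsion = similarity to the target concept,
    # retention = negative similarity to the original prompt.
    return w_rep * cosine(z, target) - w_ret * cosine(z, prompt)

def dual_energy_step(z, target, prompt, lr=0.1, w_rep=1.0, w_ret=2.0):
    """One gradient-descent step on the combined energy in latent space."""
    g_rep = cosine_grad(z, target)   # gradient of the repulsion term
    g_ret = cosine_grad(z, prompt)   # gradient of the prompt similarity
    return [zi - lr * (w_rep * gr - w_ret * gp)
            for zi, gr, gp in zip(z, g_rep, g_ret)]

# Hypothetical directions standing in for concept and prompt embeddings.
target = [1.0, 0.0, 0.0, 0.0]
prompt = [0.0, 1.0, 0.0, 0.0]

# Toy latent that starts equally aligned with both directions.
z0 = [1.0, 1.0, 0.2, 0.0]
z = list(z0)
for _ in range(100):
    z = dual_energy_step(z, target, prompt)

print("target similarity:", cosine(z0, target), "->", cosine(z, target))
print("prompt similarity:", cosine(z0, prompt), "->", cosine(z, prompt))
```

In this toy setting the update lowers the latent's similarity to the target direction while raising its similarity to the prompt direction, which is the qualitative behavior the repulsion and retention energies are meant to produce. Because the update touches only the latent and no model weights, the same kind of step could in principle be layered on top of an existing erasure method, which is the sense in which the abstract describes plug-and-play integration.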