🤖 AI Summary
Text-to-image diffusion models suffer from knowledge entrenchment, hindering selective forgetting—particularly for multi-concept erasure (e.g., copyrighted material, biases, or harmful concepts). Existing approaches face fundamental bottlenecks: incomplete forgetting, degraded generation fidelity, and poor training stability. To address this, we propose a controllable multi-concept forgetting framework. Our method introduces (1) a dynamic gradient masking mechanism for state-adaptive forgetting control; (2) a concept-aware loss function integrating superclass alignment and semantic consistency constraints; and (3) knowledge distillation regularization to ensure robustness under sequential forgetting. Extensive experiments demonstrate a 32% improvement in forgetting efficacy while significantly enhancing image fidelity and semantic coherence. To the best of our knowledge, this is the first work achieving stable, high-fidelity, and scalable controllable forgetting of multiple concepts in diffusion models.
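The dynamic gradient masking idea above can be illustrated with a minimal sketch: at each step, keep only the largest-magnitude fraction of gradient entries and zero the rest, so the forgetting update touches only the weights most implicated in the target concept. The function name, the top-k-by-magnitude criterion, and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual mechanism.

```python
def dynamic_gradient_mask(grad, keep_ratio=0.5):
    """Zero all but the largest-magnitude fraction of gradient entries,
    leaving weights unrelated to the erased concept untouched.
    Illustrative stand-in for state-adaptive gradient masking."""
    k = max(1, int(keep_ratio * len(grad)))
    # Threshold at the k-th largest magnitude (ties may keep extra entries).
    thresh = sorted((abs(g) for g in grad), reverse=True)[k - 1]
    return [g if abs(g) >= thresh else 0.0 for g in grad]

grad = [0.05, -0.9, 0.1, 1.2, -0.02, 0.3]
masked = dynamic_gradient_mask(grad, keep_ratio=0.5)
# masked -> [0.0, -0.9, 0.0, 1.2, 0.0, 0.3]
```

In a real diffusion model the mask would be recomputed per step from the current optimization state (hence "dynamic"), rather than from a fixed ratio.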
📝 Abstract
Text-to-image (T2I) diffusion models have achieved remarkable success in generating high-quality images from textual prompts. However, their ability to store vast amounts of knowledge raises concerns in scenarios where selective forgetting is necessary, such as removing copyrighted content, reducing biases, or eliminating harmful concepts. While existing unlearning methods can remove certain concepts, they struggle with multi-concept forgetting due to instability, residual knowledge persistence, and generation quality degradation. To address these challenges, we propose **Dynamic Mask coupled with Concept-Aware Loss**, a novel unlearning framework designed for multi-concept forgetting in diffusion models. Our **Dynamic Mask** mechanism adaptively updates gradient masks based on current optimization states, allowing selective weight modifications that prevent interference with unrelated knowledge. Additionally, our **Concept-Aware Loss** explicitly guides the unlearning process by enforcing semantic consistency through superclass alignment, while a regularization loss based on knowledge distillation ensures that previously unlearned concepts remain forgotten during sequential unlearning. We conduct extensive experiments to evaluate our approach. Results demonstrate that our method outperforms existing unlearning techniques in forgetting effectiveness, output fidelity, and semantic coherence, particularly in multi-concept scenarios. Our work provides a principled and flexible framework for stable and high-fidelity unlearning in generative models. The code will be released publicly.
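The two loss terms described above can be sketched as a single objective: a superclass-alignment term that pulls the erased concept's prediction toward a superclass reference, plus a distillation term that pins predictions for previously forgotten concepts to a frozen copy of the already-unlearned model. This is a minimal sketch on toy vectors; the function names, the use of mean squared error, and the weight `lam` are assumptions, not the paper's actual formulation.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def concept_aware_loss(pred_erase, superclass_ref, pred_prev, frozen_prev,
                       lam=1.0):
    """Hypothetical combined objective: align the erased concept with its
    superclass (e.g. a specific breed toward "dog"), while a distillation
    term keeps outputs for earlier-forgotten concepts matched to a frozen
    already-unlearned model, so sequential unlearning stays stable."""
    align = mse(pred_erase, superclass_ref)   # superclass alignment
    distill = mse(pred_prev, frozen_prev)     # knowledge-distillation reg.
    return align + lam * distill

# Perfect alignment and unchanged prior forgetting give zero loss.
loss = concept_aware_loss([1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.5, 0.5])
```

In practice both terms would operate on the diffusion model's noise predictions or text-conditioned features rather than raw vectors, but the structure of the trade-off is the same.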