🤖 AI Summary
Existing concept erasure methods for mitigating harmful content in text-to-image generation suffer from visual artifacts or rely on manually selected anchor concepts. This paper proposes ANT, a novel finetuning framework that introduces, for the first time, a trajectory-aware inverse conditional guidance objective within classifier-free guidance, so that denoising trajectories automatically avoid harmful concepts. For single-concept erasure, the authors design an augmentation-enhanced weight saliency map that localizes the parameters most responsible for the unwanted concept. For multi-concept collaborative erasure, they develop a parameter-free, plug-and-play mechanism. ANT thus combines conditional guidance analysis of diffusion models, trajectory-aware loss optimization, parameter-level saliency assessment, and data-augmentation-driven localization of sensitive parameters. Experiments demonstrate that ANT achieves state-of-the-art performance on both single- and multi-concept erasure tasks, suppressing harmful content generation while preserving image fidelity and structural integrity without introducing noticeable artifacts.
📝 Abstract
Ensuring the ethical deployment of text-to-image models requires effective techniques to prevent the generation of harmful or inappropriate content. While concept erasure methods offer a promising solution, existing finetuning-based approaches suffer from notable limitations. Anchor-free methods risk disrupting sampling trajectories, leading to visual artifacts, while anchor-based methods rely on the heuristic selection of anchor concepts. To overcome these shortcomings, we introduce a finetuning framework, dubbed ANT, which Automatically guides deNoising Trajectories to avoid unwanted concepts. ANT is built on a key insight: reversing the condition direction of classifier-free guidance during mid-to-late denoising stages enables precise content modification without sacrificing early-stage structural integrity. This inspires a trajectory-aware objective that preserves the integrity of the early-stage score function field, which steers samples toward the natural image manifold, without relying on heuristic anchor concept selection. For single-concept erasure, we propose an augmentation-enhanced weight saliency map to precisely identify the critical parameters that contribute most to the unwanted concept, enabling more thorough and efficient erasure. For multi-concept erasure, our objective function offers a versatile plug-and-play solution that significantly boosts performance. Extensive experiments demonstrate that ANT achieves state-of-the-art results in both single- and multi-concept erasure, delivering high-quality, safe outputs without compromising generative fidelity. Code is available at https://github.com/lileyang1210/ANT
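The key insight above, reversing the condition direction of classifier-free guidance only in mid-to-late denoising, can be illustrated with a minimal sampling-time sketch. This is not the ANT finetuning objective or the authors' code. The function name `guided_noise`, the switch point `t_switch`, the default guidance scale, and the convention that timesteps count down from pure noise to `t = 0` are all assumptions made for illustration.

```python
from typing import List


def guided_noise(
    eps_uncond: List[float],
    eps_cond: List[float],
    t: int,
    guidance_scale: float = 7.5,   # assumed default, not taken from the paper
    t_switch: int = 500,           # hypothetical boundary between early and mid-to-late steps
) -> List[float]:
    """Combine unconditional and conditional noise predictions.

    Standard classifier-free guidance:
        eps = eps_uncond + w * (eps_cond - eps_uncond)
    Reversed condition direction (pushes away from the concept):
        eps = eps_uncond - w * (eps_cond - eps_uncond)

    Assumption: timesteps run from high (pure noise) down to 0, so
    "mid-to-late denoising" corresponds to t <= t_switch.
    """
    # Early steps keep the usual direction so the global structure is preserved;
    # mid-to-late steps flip the sign to steer away from the unwanted concept.
    sign = -1.0 if t <= t_switch else 1.0
    return [
        u + sign * guidance_scale * (c - u)
        for u, c in zip(eps_uncond, eps_cond)
    ]


if __name__ == "__main__":
    eps_u = [0.0, 1.0]
    eps_c = [1.0, 1.0]
    print(guided_noise(eps_u, eps_c, t=900, guidance_scale=2.0))  # early step
    print(guided_noise(eps_u, eps_c, t=100, guidance_scale=2.0))  # late step
```

In this toy setting, the two calls differ only in the sign of the guidance term. Per the abstract, ANT turns this observation into a trajectory-aware training objective rather than applying the flip at inference time.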