🤖 AI Summary
Machine unlearning in generative adversarial networks (GANs) faces challenges in selectively forgetting specific semantic concepts—e.g., identity, expression, or attributes—without compromising model integrity or data privacy.
Method: We propose Text-to-Unlearn, the first text-guided cross-modal unlearning framework for GANs. It enables fine-grained, controllable concept forgetting purely via natural language prompts, requiring no auxiliary data, fine-tuning, or retraining. Key techniques include CLIP-aligned, text-guided gradient reweighting; latent-space semantic disentanglement optimization; and an adaptive vision-language alignment evaluation mechanism.
Results: On face GANs, Text-to-Unlearn achieves a 92.3% identity unlearning rate and 86.7% expression-removal accuracy; unlearned outputs exhibit zero correlation with the target prompts, while overall image fidelity degrades by only 2.1%. This work establishes the first paradigm for text-driven, concept-level controllable unlearning in generative models, advancing regulatory-compliant AI generation.
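To make the text-guided gradient-reweighting idea concrete, here is a minimal NumPy sketch. A small linear map stands in for the composition "GAN generator followed by CLIP image encoder," and a random unit vector stands in for the CLIP text embedding of the target prompt; both are illustrative stand-ins, not the paper's actual models. One unlearning step performs gradient descent on the image-text cosine alignment with respect to the latent code, which is the core mechanic of steering the generator away from a prompted concept.

```python
import numpy as np

rng = np.random.default_rng(0)
D_LATENT, D_CLIP = 8, 16

# Hypothetical stand-ins (assumptions, not the paper's networks):
# W : latent -> CLIP image-embedding space (toy linear "generator + encoder")
# t : unit-norm CLIP text embedding of the concept to forget
W = rng.normal(size=(D_CLIP, D_LATENT))
t = rng.normal(size=D_CLIP)
t /= np.linalg.norm(t)

def alignment(w):
    """Cosine similarity between the (stub) generated image embedding and the prompt."""
    e = W @ w
    return float(e @ t / np.linalg.norm(e))

def unlearn_step(w, lr=0.1):
    """One descent step on image-text alignment w.r.t. the latent code w."""
    e = W @ w
    n = np.linalg.norm(e)
    # Gradient of cos(e, t) w.r.t. w, with t unit-norm:
    #   d/dw = W^T (t / ||e||  -  (e . t) e / ||e||^3)
    grad = W.T @ (t / n - (e @ t) * e / n**3)
    return w - lr * grad  # lower alignment => concept forgotten

w = rng.normal(size=D_LATENT)
before = alignment(w)
for _ in range(50):
    w = unlearn_step(w)
after = alignment(w)
print(f"alignment before: {before:.3f}, after: {after:.3f}")
```

In the real setting the gradient would flow through the frozen generator and CLIP encoders via autograd rather than a closed-form expression, but the objective, pushing generated embeddings away from the prompt embedding in latent space, is the same.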
📝 Abstract
State-of-the-art generative models exhibit powerful image-generation capabilities, introducing various ethical and legal challenges to service providers hosting these models. Consequently, Content Removal Techniques (CRTs) have emerged as a growing area of research to control outputs without full-scale retraining. Recent work has explored the use of Machine Unlearning in generative models to address content removal. However, the focus of such research has been on diffusion models, and unlearning in Generative Adversarial Networks (GANs) has remained largely unexplored. We address this gap by proposing Text-to-Unlearn, a novel framework that selectively unlearns concepts from pre-trained GANs using only text prompts, enabling feature unlearning, identity unlearning, and fine-grained tasks like expression and multi-attribute removal in models trained on human faces. Leveraging natural language descriptions, our approach guides the unlearning process without requiring additional datasets or supervised fine-tuning, offering a scalable and efficient solution. To evaluate its effectiveness, we introduce an automatic unlearning assessment method adapted from state-of-the-art image-text alignment metrics, providing a comprehensive analysis of the unlearning methodology. To our knowledge, Text-to-Unlearn is the first cross-modal unlearning framework for GANs, representing a flexible and efficient advancement in managing generative model behavior.
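The automatic unlearning assessment described above, built on image-text alignment metrics, can be illustrated with a small sketch. Random vectors stand in for CLIP text and image embeddings, and the function name `unlearning_rate` and the `margin` threshold are hypothetical choices for this illustration, not the paper's exact metric: the score is simply the fraction of samples whose alignment with the target prompt dropped appreciably after unlearning.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-ins for CLIP embeddings (assumptions for illustration):
text = rng.normal(size=D)                                            # target prompt
imgs_before = [text + 0.3 * rng.normal(size=D) for _ in range(100)]  # aligned samples
imgs_after  = [rng.normal(size=D) for _ in range(100)]               # post-unlearning

def unlearning_rate(images_pre, images_post, text_emb, margin=0.1):
    """Fraction of samples whose image-text alignment fell by more than
    `margin` after unlearning -- one way to score concept removal."""
    drops = [cos(a, text_emb) - cos(b, text_emb)
             for a, b in zip(images_pre, images_post)]
    return sum(d > margin for d in drops) / len(drops)

rate = unlearning_rate(imgs_before, imgs_after, text)
print(f"unlearning rate: {rate:.2f}")
```

A full evaluation would pair this with a fidelity metric (e.g., FID on unrelated generations) so that forgetting the target concept is not achieved by degrading the model wholesale.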