Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models

📅 2024-09-17

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0


🤖 AI Summary

This work addresses the challenge of removing harmful content from diffusion models without access to real training data. It introduces a score-based unlearning distillation framework that embeds the unlearning objective directly into score distillation: the conditional scores of targeted undesirable classes or concepts are aligned with those of safe ones, so that the targets are selectively erased. The approach combines conditional score alignment, a one-step generator that supplies synthetic data, and the score distillation objective acting as a regularization term, which together speed up sampling while preserving generation quality. Experiments on pretrained label-conditional and text-to-image diffusion models show that the method forgets the specified classes or concepts while preserving the quality of the others, and that it generalizes across a range of models and datasets. By enabling data-free unlearning, the work supports safer and more trustworthy deployment of generative AI systems.

📝 Abstract

The machine learning community is increasingly recognizing the importance of fostering trust and safety in modern generative AI (GenAI) models. We posit machine unlearning (MU) as a crucial foundation for developing safe, secure, and trustworthy GenAI models. Traditional MU methods often rely on stringent assumptions and require access to real data. This paper introduces Score Forgetting Distillation (SFD), an innovative MU approach that promotes the forgetting of undesirable information in diffusion models by aligning the conditional scores of "unsafe" classes or concepts with those of "safe" ones. To eliminate the need for real data, our SFD framework incorporates a score-based MU loss into the score distillation objective of a pretrained diffusion model. This serves as a regularization term that preserves desired generation capabilities while enabling the production of synthetic data through a one-step generator. Our experiments on pretrained label-conditional and text-to-image diffusion models demonstrate that our method effectively accelerates the forgetting of target classes or concepts during generation, while preserving the quality of other classes or concepts. This unlearned and distilled diffusion not only pioneers a novel concept in MU but also accelerates the generation speed of diffusion models. Our experiments and studies on a range of diffusion models and datasets confirm that our approach is generalizable, effective, and advantageous for MU in diffusion models. Code is available at https://github.com/tqch/score-forgetting-distillation. (Warning: This paper contains sexually explicit imagery, discussions of pornography, racially-charged terminology, and other content that some readers may find disturbing, distressing, and/or offensive.)
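The score-alignment idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see their repository for that): it assumes each class-conditional distribution is an isotropic Gaussian, so the score has a closed form, and it uses random noise in place of the one-step generator's synthetic samples. The names `gaussian_score`, `score_alignment_loss`, `TEACHER_MEANS`, and `make_student` are hypothetical and exist only for this illustration.

```python
import numpy as np

def gaussian_score(x, mean, sigma=1.0):
    """Score (gradient of the log-density) of N(mean, sigma^2 I) at x."""
    return -(x - mean) / sigma**2

def score_alignment_loss(student_score, teacher_score, x, unsafe_label, safe_label):
    """Toy forgetting loss: mean squared gap between the student's score for
    the unsafe label and the frozen teacher's score for the safe label."""
    diff = student_score(x, unsafe_label) - teacher_score(x, safe_label)
    return float(np.mean(np.sum(diff**2, axis=-1)))

# Assumption: toy class means standing in for a pretrained conditional model.
TEACHER_MEANS = {"unsafe": np.array([2.0, 0.0]), "safe": np.array([-2.0, 0.0])}

def teacher(x, label):
    """Frozen 'pretrained' conditional score (closed-form Gaussian here)."""
    return gaussian_score(x, TEACHER_MEANS[label])

def make_student(means):
    """Build a student score function from a dict of per-label means."""
    return lambda x, label: gaussian_score(x, means[label])

# Assumption: random points stand in for samples from a one-step generator.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 2))

# Before unlearning, the student copies the teacher, so the gap is large.
before = score_alignment_loss(make_student(TEACHER_MEANS), teacher, x, "unsafe", "safe")

# After (idealized) unlearning, the student's unsafe score matches the safe one.
aligned_means = {**TEACHER_MEANS, "unsafe": TEACHER_MEANS["safe"]}
after = score_alignment_loss(make_student(aligned_means), teacher, x, "unsafe", "safe")

print(f"loss before alignment: {before:.3f}, after: {after:.3f}")
```

In this toy setting the loss is the squared distance between the two class means before alignment and drops to zero once the student's "unsafe" score coincides with the teacher's "safe" score. In the actual method, per the abstract, such a term is combined with the score distillation objective, which acts as a regularizer to preserve the remaining classes.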

Problem

Research questions and friction points this paper is trying to address.

Develops a method for machine unlearning in diffusion models.

Eliminates need for real data in machine unlearning processes.

Enhances safety and trust in generative AI models.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Score Forgetting Distillation for machine unlearning

Aligns unsafe and safe class conditional scores

Enables synthetic data generation without real data

🔎 Similar Papers

Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning