A Unified Framework for Diffusion Model Unlearning with f-Divergence

📅 2025-09-25

📈 Citations: 0

✨ Influential: 0

career value

189K/year

🤖 AI Summary

Existing MSE-based machine unlearning methods for text-to-image diffusion models represent only a special case of the f-divergence framework, lacking systematic modeling of the relationship between divergence selection and unlearning performance. Method: We propose the first unified unlearning framework grounded in the f-divergence family (e.g., KL, JS, χ²), which models the discrepancy between the target concept’s output distribution and an anchor concept’s distribution as a configurable f-divergence, and integrates it into the reverse denoising process of diffusion models to enable controllable knowledge erasure. Contribution/Results: Theoretical analysis characterizes how different f-divergences affect convergence speed and unlearning quality. Experiments on mainstream T2I models—including Stable Diffusion—demonstrate that our method significantly improves unlearning strength (+23.6%) and concept isolation (+18.4%) while preserving generation fidelity, enabling customizable, demand-driven unlearning strategies.

Technology Category

Application Category

📝 Abstract

Machine unlearning aims to remove specific knowledge from a trained model. While diffusion models (DMs) have shown remarkable generative capabilities, existing unlearning methods for text-to-image (T2I) models often rely on minimizing the mean squared error (MSE) between the output distribution of a target and an anchor concept. We show that this MSE-based approach is a special case of a unified $f$-divergence-based framework, in which any $f$-divergence can be utilized. We analyze the benefits of using different $f$-divergences, that mainly impact the convergence properties of the algorithm and the quality of unlearning. The proposed unified framework offers a flexible paradigm that allows to select the optimal divergence for a specific application, balancing different trade-offs between aggressive unlearning and concept preservation.

Problem

Research questions and friction points this paper is trying to address.

Removing specific knowledge from trained diffusion models

Unifying unlearning methods under f-divergence framework

Balancing aggressive unlearning with concept preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified f-divergence framework for unlearning

Flexible paradigm selecting optimal divergence

Balancing aggressive unlearning and preservation

🔎 Similar Papers

Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning