Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models

📅 2026-04-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses a gap in how post-hoc unlearning methods are evaluated: their effect on compositional generation when undesirable concepts, such as explicit content, are removed from text-to-image diffusion models has not been systematically measured. Focusing on nudity removal in Stable Diffusion 1.4, the work presents a systematic empirical analysis of state-of-the-art unlearning approaches through the lens of compositional generation, using T2I-CompBench++ and GenEval alongside established unlearning benchmarks. The investigation examines attribute binding, spatial reasoning, and counting. Results reveal a consistent trade-off between effective concept removal and preservation of compositional integrity: methods that erase strongly frequently incur substantial degradation in compositional abilities, whereas methods that preserve compositional structure often fail to provide robust erasure. This highlights a limitation of current unlearning approaches, which struggle to achieve targeted suppression and semantic fidelity at the same time.
📝 Abstract
Post-hoc unlearning has emerged as a practical mechanism for removing undesirable concepts from large text-to-image diffusion models. However, prior work primarily evaluates unlearning through erasure success; its impact on broader generative capabilities remains poorly understood. In this work, we conduct a systematic empirical study of concept unlearning through the lens of compositional text-to-image generation. Focusing on nudity removal in Stable Diffusion 1.4, we evaluate a diverse set of state-of-the-art unlearning methods using T2I-CompBench++ and GenEval, alongside established unlearning benchmarks. Our results reveal a consistent trade-off between unlearning effectiveness and compositional integrity: methods that achieve strong erasure frequently incur substantial degradation in attribute binding, spatial reasoning, and counting. Conversely, approaches that preserve compositional structure often fail to provide robust erasure. These findings highlight limitations of current evaluation practices and underscore the need for unlearning objectives that explicitly account for semantic preservation beyond targeted suppression.
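The trade-off described in the abstract can be made concrete with a small bookkeeping sketch. It is not the paper's evaluation code: the `MethodScores` container, the `compositional_retention` ratio, the category names, and every number below are hypothetical placeholders chosen for illustration, and real scores would come from running T2I-CompBench++, GenEval, and an erasure benchmark.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MethodScores:
    """Scores for one unlearning method (hypothetical container)."""
    name: str
    erasure_rate: float              # fraction of target prompts with the concept absent, in [0, 1]
    compositional: Dict[str, float]  # per-category compositional score, in [0, 1]


def compositional_retention(baseline: Dict[str, float],
                            unlearned: Dict[str, float]) -> float:
    """Mean ratio of unlearned to baseline score over shared categories.

    1.0 means compositional ability is fully preserved; lower values
    mean degradation relative to the original model.
    """
    shared = [k for k in baseline if k in unlearned and baseline[k] > 0]
    if not shared:
        raise ValueError("no shared categories with a positive baseline score")
    return sum(unlearned[k] / baseline[k] for k in shared) / len(shared)


def summarize(baseline: Dict[str, float],
              methods: List[MethodScores]) -> List[Tuple[str, float, float]]:
    """Return (name, erasure_rate, retention) rows, strongest erasure first."""
    rows = [(m.name, m.erasure_rate,
             compositional_retention(baseline, m.compositional))
            for m in methods]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Placeholder numbers only; these are NOT results from the paper.
    baseline = {"attribute_binding": 0.5, "spatial": 0.2, "counting": 0.4}
    methods = [
        MethodScores("aggressive_method", 0.9,
                     {"attribute_binding": 0.25, "spatial": 0.1, "counting": 0.2}),
        MethodScores("conservative_method", 0.4, dict(baseline)),
    ]
    for name, erasure, retention in summarize(baseline, methods):
        print(f"{name}: erasure={erasure:.2f}, retention={retention:.2f}")
```

Reporting erasure and retention side by side, rather than erasure alone, is one simple way to expose the pattern the abstract describes, where a method can look strong on erasure while losing much of its compositional ability.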
Problem

Research questions and friction points this paper is trying to address.

unlearning
text-to-image diffusion models
compositional generation
concept erasure
semantic preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

concept unlearning
compositional generation
text-to-image diffusion models
semantic preservation
erasure-evaluation trade-off
Arian Komaei Koma
Department of Computer Engineering, Sharif University of Technology
Seyed Amir Kasaei
Department of Computer Engineering, Sharif University of Technology
Ali Aghayari
Department of Computer Engineering, Sharif University of Technology
AmirMahdi Sadeghzadeh
Department of Computer Engineering, Sharif University of Technology
Mohammad Hossein Rohban
Associate Professor in Computer Engineering, Sharif University of Technology
Machine Learning, Statistics, Computational Biology