TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models

📅 2026-03-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current concept erasure methods for text-to-image diffusion models merely disrupt the text-to-image mapping without genuinely removing the underlying visual knowledge, leaving the models susceptible to generating harmful content. This work proposes a text-free inversion attack based on the DDIM inversion framework, employing an empty-text conditioning strategy augmented with an optimization mechanism to mitigate error accumulation in the absence of textual guidance. The method achieves, for the first time, fully text-agnostic inversion and successfully reconstructs erased concepts from models processed by state-of-the-art unlearning techniques. These findings expose a fundamental flaw in existing erasure approaches: they mask rather than eliminate visual knowledge embedded within the model.

📝 Abstract
Although text-to-image diffusion models exhibit remarkable generative power, concept erasure techniques are essential for their safe deployment, preventing the creation of harmful content. This has fostered a dynamic interplay between erasure defenses and the adversarial probes designed to bypass them, a co-evolution that has progressively improved the efficacy of erasure methods. However, it has also converged on a narrow, text-centric paradigm that equates erasure with severing the text-to-image mapping, ignoring that the underlying visual knowledge of undesired concepts still persists. To substantiate this claim, we investigate from a visual perspective, leveraging DDIM inversion to probe whether a generative pathway for the erased concept can still be found. Identifying such a visual generative pathway is challenging, however, because standard text-guided DDIM inversion is actively resisted by the text-centric defenses within the erased model. To address this, we introduce TINA, a novel Text-free INversion Attack, which enforces this visual-only probe by operating under a null-text condition, thereby sidestepping existing text-centric defenses. Moreover, TINA integrates an optimization procedure to counter the approximation errors that accumulate when standard inversion operates without its usual textual guidance. Our experiments demonstrate that TINA regenerates erased concepts from models treated with state-of-the-art unlearning. The success of TINA shows that current methods merely obscure concepts, highlighting an urgent need for erasure paradigms that operate directly on internal visual knowledge.
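The empty-text inversion the abstract describes can be pictured as an ordinary DDIM inversion loop in which every noise prediction is made without a prompt. Below is a minimal NumPy sketch of that loop; `eps_model` is a stand-in for the diffusion U-Net queried with a null-text embedding, the two-step `alphas` schedule is illustrative, and TINA's error-correcting optimization against accumulated approximation error is deliberately omitted.

```python
import numpy as np

def ddim_inversion_step(x_t, eps, alpha_t, alpha_next):
    """One deterministic DDIM inversion step (clean latent -> noisier latent).

    x_t: current latent; eps: noise predicted under the NULL (empty-text)
    condition; alpha_t / alpha_next: cumulative signal rates at the current
    and next (noisier) timesteps.
    """
    # Clean-latent estimate implied by the current latent and predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    # Re-noise the estimate to the next timestep using the same eps.
    return np.sqrt(alpha_next) * x0_pred + np.sqrt(1.0 - alpha_next) * eps

def text_free_invert(x0, eps_model, alphas):
    """Run inversion with no textual guidance: eps_model(x, t) is assumed to
    be a noise predictor conditioned only on a null-text embedding."""
    x = x0
    for t in range(len(alphas) - 1):
        eps = eps_model(x, t)  # null-text conditioned prediction
        x = ddim_inversion_step(x, eps, alphas[t], alphas[t + 1])
    return x
```

The attack surface this exposes is that nothing in the loop consults the text encoder, so text-centric erasure defenses never trigger; the recovered latent can then be re-denoised toward the erased concept.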
Problem

Research questions and friction points this paper is trying to address.

concept erasure
text-to-image diffusion models
visual knowledge
unlearning
adversarial attack
Innovation

Methods, ideas, or system contributions that make the work stand out.

text-free inversion
concept erasure
diffusion models
adversarial attack
visual knowledge
Qianlong Xiang
Harbin Institute of Technology (Shenzhen)
Miao Zhang
Harbin Institute of Technology (Shenzhen)
Haoyu Zhang
Harbin Institute of Technology (Shenzhen)
Kun Wang
Shandong University
Junhui Hou
Department of Computer Science, City University of Hong Kong
Liqiang Nie
Harbin Institute of Technology (Shenzhen)