Where Concept Erasure Should Occur: Concept-Layer Alignment in Text-to-Video Diffusion Models

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of concept erasure in text-to-video diffusion models: semantic information is distributed unevenly across model layers, so at most depths target concepts remain strongly entangled with non-target signals. The authors propose CLEAR, a framework that reframes concept erasure as the problem of identifying the layers where target concepts and non-target signals naturally separate. CLEAR formulates layer selection as an optimization problem driven by a separability-aware objective that favors layers with stronger concept-non-target separation, rather than relying on heuristic or layer-agnostic choices. In experiments on large-scale text-to-video diffusion models, enforcing this concept-layer alignment yields more precise concept suppression while preserving overall generative quality.

📝 Abstract

Text-to-video diffusion transformers encode semantic information unevenly across model depth, which constrains effective concept erasure. We identify a representational bottleneck, termed concept-layer topological alignment, under which target concepts exhibit higher separability at certain representational depths. Outside these depths, concept and non-target signals remain strongly entangled, limiting the effectiveness of depth-specific erasure. This observation reframes concept erasure as the problem of identifying representational depths where concept-non-target separation naturally emerges. Motivated by this structural constraint, we introduce CLEAR, a separability-driven optimization framework for concept erasure that explicitly enforces concept-layer alignment. CLEAR operationalizes this principle by formulating layer selection as an optimization problem over concept-non-target separability, rather than relying on layer-agnostic or heuristic choices. To enable this, we introduce a separability-aware objective that favors layers exhibiting stronger concept-non-target separation. Experiments on large-scale text-to-video diffusion models demonstrate that enforcing concept-layer alignment leads to more precise concept suppression while preserving overall generative quality.
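
The idea of choosing a layer by concept-non-target separability can be illustrated with a small sketch. The abstract does not specify how CLEAR measures separability or how its objective is optimized, so everything below is an assumption made for illustration: the Fisher-style ratio is a stand-in score, the function names `fisher_separability` and `select_layer` are hypothetical, and the per-layer activations are synthetic rather than taken from a video diffusion transformer.

```python
import numpy as np


def fisher_separability(concept, non_target):
    """Stand-in separability score (an assumption, not CLEAR's objective):
    squared distance between the two group means divided by the summed
    per-feature variance of both groups."""
    mu_c = concept.mean(axis=0)
    mu_n = non_target.mean(axis=0)
    between = np.sum((mu_c - mu_n) ** 2)
    within = concept.var(axis=0).sum() + non_target.var(axis=0).sum()
    return float(between / (within + 1e-8))


def select_layer(concept_acts, non_target_acts):
    """Score every layer and return the index of the most separable one."""
    scores = [
        fisher_separability(c, n)
        for c, n in zip(concept_acts, non_target_acts)
    ]
    return int(np.argmax(scores)), scores


# Synthetic activations for four hypothetical layers, shape (samples, features).
# The concept group is shifted away from the non-target group by a different
# amount at each layer, mimicking uneven separability across depth.
rng = np.random.default_rng(0)
shifts = [0.0, 0.5, 3.0, 1.0]
concept_acts = [rng.normal(loc=s, scale=1.0, size=(64, 16)) for s in shifts]
non_target_acts = [rng.normal(loc=0.0, scale=1.0, size=(64, 16)) for _ in shifts]

best_layer, scores = select_layer(concept_acts, non_target_acts)
print("most separable layer index:", best_layer)
print("per-layer scores:", [round(s, 3) for s in scores])
```

In this toy setup the layer with the largest mean shift receives the highest score, which is the layer an erasure method following the abstract's principle would target. A real pipeline would replace the synthetic arrays with activations collected from each transformer block for concept and non-target prompts.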

Problem

Research questions and friction points this paper is trying to address.

concept erasure

text-to-video diffusion models

representational depth

concept-layer alignment

concept-non-target separability

Innovation

Methods, ideas, or system contributions that make the work stand out.

concept erasure

concept-layer alignment

diffusion transformers