Localizing Knowledge in Diffusion Transformers

📅 2025-05-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the problem of localizing semantic knowledge across the layers of Diffusion Transformers (DiTs), aiming to improve the interpretability, controllability, and adaptability of generative models. The authors propose a model- and knowledge-agnostic localization method that, for the first time, systematically characterizes how six types of semantic knowledge are distributed across the blocks of leading DiT models, including PixArt-α, FLUX, and SANA. The approach combines attention-mechanism analysis, inter-layer knowledge attribution, causal intervention for validation, and localized fine-tuning, uncovering causal links between layer-specific knowledge and generation outcomes. Experiments show that this method substantially reduces fine-tuning overhead, improves downstream task performance, better preserves general-purpose capabilities, and minimizes interference with unrelated knowledge. Together, these advantages provide a foundation for precise model editing, such as personalized adaptation and targeted knowledge forgetting, without compromising model integrity.

📝 Abstract
Understanding how knowledge is distributed across the layers of generative models is crucial for improving interpretability, controllability, and adaptation. While prior work has explored knowledge localization in UNet-based architectures, Diffusion Transformer (DiT)-based models remain underexplored in this context. In this paper, we propose a model- and knowledge-agnostic method to localize where specific types of knowledge are encoded within the DiT blocks. We evaluate our method on state-of-the-art DiT-based models, including PixArt-alpha, FLUX, and SANA, across six diverse knowledge categories. We show that the identified blocks are both interpretable and causally linked to the expression of knowledge in generated outputs. Building on these insights, we apply our localization framework to two key applications: model personalization and knowledge unlearning. In both settings, our localized fine-tuning approach enables efficient and targeted updates, reducing computational cost, improving task-specific performance, and better preserving general model behavior with minimal interference to unrelated or surrounding content. Overall, our findings offer new insights into the internal structure of DiTs and introduce a practical pathway for more interpretable, efficient, and controllable model editing.
Problem

Research questions and friction points this paper is trying to address.

Localize knowledge in Diffusion Transformer layers
Improve interpretability and controllability of DiT models
Enable efficient model personalization and knowledge unlearning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-agnostic knowledge localization in DiT blocks
Interpretable causal links to knowledge expression
Localized fine-tuning for efficient model updates
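The localized fine-tuning idea above can be sketched in a few lines: score each DiT block by how strongly it contributes to a target concept, keep only the top-ranked blocks trainable, and freeze the rest. The sketch below is a hypothetical, dependency-free illustration of that selection step; the scoring values, function names, and top-k selection rule are assumptions for illustration, not the paper's exact attribution procedure.

```python
# Hypothetical sketch of localized fine-tuning block selection.
# Each DiT block gets an attribution score (how much it contributes to a
# target concept); only the top-k blocks are marked trainable, the rest
# are frozen. Scores and k are illustrative placeholders.

def select_blocks(attribution_scores, k):
    """Return indices of the k blocks with the highest attribution."""
    ranked = sorted(range(len(attribution_scores)),
                    key=lambda i: attribution_scores[i], reverse=True)
    return sorted(ranked[:k])

def trainable_mask(num_blocks, selected):
    """Per-block flag: True -> fine-tune this block, False -> freeze it."""
    chosen = set(selected)
    return [i in chosen for i in range(num_blocks)]

# Example: 12 DiT blocks with synthetic attribution scores for one concept.
scores = [0.02, 0.10, 0.55, 0.61, 0.08, 0.03,
          0.47, 0.05, 0.01, 0.04, 0.02, 0.01]
selected = select_blocks(scores, k=3)
mask = trainable_mask(len(scores), selected)
print(selected)   # indices of the most concept-relevant blocks -> [2, 3, 6]
print(sum(mask))  # number of blocks left trainable -> 3
```

In a real DiT implementation, the mask would drive `requires_grad` on each block's parameters so that the optimizer only updates the localized blocks, which is what makes the update cheap and keeps unrelated knowledge intact.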