🤖 AI Summary
Text-to-image (T2I) diffusion models suffer from semantic leakage: unintended cross-entity semantic associations arising from excessive attention-based interactions among distinct entities. To address this, we propose DeLeaker, a lightweight, training-free inference-time intervention that dynamically reweights attention maps during the denoising process to suppress inter-entity semantic leakage; to our knowledge, this is the first work to directly modulate attention mechanisms at inference time for mitigating semantic leakage. We also introduce SLIM, the first dedicated benchmark dataset for semantic leakage evaluation, along with an automated assessment framework. Extensive experiments demonstrate that DeLeaker significantly outperforms existing baselines across diverse scenarios, effectively reducing semantic leakage while preserving image quality and fidelity, without requiring additional inputs or model fine-tuning.
📝 Abstract
Text-to-Image (T2I) models have advanced rapidly, yet they remain vulnerable to semantic leakage, the unintended transfer of semantically related features between distinct entities. Existing mitigation strategies are often optimization-based or dependent on external inputs. We introduce DeLeaker, a lightweight, optimization-free inference-time approach that mitigates leakage by directly intervening on the model's attention maps. Throughout the diffusion process, DeLeaker dynamically reweights attention maps to suppress excessive cross-entity interactions while strengthening the identity of each entity. To support systematic evaluation, we introduce SLIM (Semantic Leakage in IMages), the first dataset dedicated to semantic leakage, comprising 1,130 human-verified samples spanning diverse scenarios, together with a novel automatic evaluation framework. Experiments demonstrate that DeLeaker consistently outperforms all baselines, even when they are provided with external information, achieving effective leakage mitigation without compromising fidelity or quality. These results underscore the value of attention control and pave the way for more semantically precise T2I models.
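The core idea, reweighting attention maps to suppress cross-entity interactions while strengthening each entity's identity, can be sketched with a toy example. The sketch below is an illustration of the general technique, not the paper's actual algorithm: the `suppress`/`boost` multipliers, the `entity_ids` labeling, and the renormalization step are all assumptions made for this example.

```python
import numpy as np

def reweight_attention(attn, entity_ids, suppress=0.5, boost=1.5):
    """Reweight a row-stochastic attention map to damp cross-entity leakage.

    attn:       (n, n) attention matrix whose rows sum to 1
    entity_ids: length-n array; tokens sharing an id belong to the same
                entity (id -1 marks background tokens, left untouched)
    suppress:   multiplier for attention between different entities
                (hypothetical parameter, not from the paper)
    boost:      multiplier for attention within the same entity
                (hypothetical parameter, not from the paper)
    """
    ids = np.asarray(entity_ids)
    tagged = ids[:, None] >= 0
    same = (ids[:, None] == ids[None, :]) & tagged
    cross = (ids[:, None] != ids[None, :]) & tagged & (ids[None, :] >= 0)
    # Boost within-entity weights, suppress cross-entity weights,
    # leave background interactions unchanged.
    w = np.where(same, boost, np.where(cross, suppress, 1.0))
    out = attn * w
    # Renormalize so each row remains a valid attention distribution.
    return out / out.sum(axis=-1, keepdims=True)
```

Because rows are renormalized after scaling, the fraction of each token's attention mass spent on other entities strictly decreases whenever any same-entity mass exists, which is the leakage-suppression effect the abstract describes.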