Mitigating Hallucinations in Diffusion Models through Adaptive Attention Modulation

📅 2025-02-24
🤖 AI Summary
Hallucinations, i.e., generated details that are implausible and inconsistent with the training data distribution, are pervasive in diffusion model synthesis. To address this at the level of the attention mechanism, the authors propose Adaptive Attention Modulation (AAM), which comprises two core mechanisms: (1) temperature scaling inside the softmax of the self-attention layers, which dynamically sharpens or softens the attention distribution during early denoising steps; and (2) masked perturbation, which disrupts early-stage noise that could otherwise propagate into later denoising steps as hallucinations. AAM requires no additional training, fine-tuning, or annotated data, enabling plug-and-play integration into existing diffusion pipelines. On the Hands dataset, AAM improves the Fréchet Inception Distance (FID) by 20.8% and reduces the percentage of hallucinated images by 12.9 percentage points (absolute), improving both the fidelity and the reliability of generated outputs.

📝 Abstract
Diffusion models, while increasingly adept at generating realistic images, are notably hindered by hallucinations: unrealistic or incorrect features inconsistent with the training data distribution. In this work, we propose Adaptive Attention Modulation (AAM), a novel approach to mitigate hallucinations by analyzing and modulating the self-attention mechanism in diffusion models. We hypothesize that self-attention during early denoising steps may inadvertently amplify or suppress features, contributing to hallucinations. To counter this, AAM introduces a temperature scaling mechanism within the softmax operation of the self-attention layers, dynamically modulating the attention distribution during inference. Additionally, AAM employs a masked perturbation technique to disrupt early-stage noise that may otherwise propagate into later stages as hallucinations. Extensive experiments demonstrate that AAM effectively reduces hallucinatory artifacts, enhancing both the fidelity and reliability of generated images. For instance, the proposed approach improves the FID score by 20.8% and reduces the percentage of hallucinated images by 12.9% (in absolute terms) on the Hands dataset.
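To make the temperature-scaling idea concrete, here is a minimal NumPy sketch of single-head self-attention with a temperature applied inside the softmax. It is an illustration only, not the paper's implementation: the function names (`attention_with_temperature`, `mean_entropy`), the toy shapes, and the fixed temperature values are all assumptions, and the abstract does not say how AAM chooses the temperature at each denoising step.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_temperature(q, k, v, temperature=1.0):
    """Scaled dot-product attention with an extra softmax temperature.

    temperature < 1 sharpens the attention weights, temperature > 1
    softens them, and temperature == 1 recovers standard attention.
    (Hypothetical helper; the name and signature are assumptions.)
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    weights = softmax(logits / temperature, axis=-1)
    return weights @ v, weights

def mean_entropy(p):
    # Average Shannon entropy of the attention rows.
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())

# Toy inputs: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))

_, w_sharp = attention_with_temperature(q, k, v, temperature=0.5)
_, w_base = attention_with_temperature(q, k, v, temperature=1.0)
_, w_soft = attention_with_temperature(q, k, v, temperature=2.0)
```

In this sketch, lower temperatures concentrate each token's attention on fewer keys (lower entropy) and higher temperatures spread it out. An adaptive scheme in the spirit of AAM would vary `temperature` across denoising steps rather than fixing it.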
Problem

Research questions and friction points this paper is trying to address.

Mitigates hallucinations in diffusion models
Modulates self-attention mechanism dynamically
Enhances image fidelity and reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Attention Modulation reduces hallucinations.
Temperature scaling modulates self-attention dynamically.
Masked perturbation disrupts early-stage noise propagation.
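
The masked perturbation idea can be sketched in a similarly hedged way. The snippet below adds Gaussian noise to an intermediate latent only where a binary mask is set. How AAM selects the mask, the noise strength, and the timesteps at which the perturbation is applied is not described in this summary, so `masked_perturbation`, `strength`, and the hand-drawn mask are purely illustrative assumptions.

```python
import numpy as np

def masked_perturbation(latent, mask, strength, rng):
    """Add Gaussian noise to `latent` only where `mask` equals 1.

    Hypothetical helper: the name, signature, and the additive
    Gaussian form are assumptions, not the paper's definition.
    """
    noise = rng.normal(size=latent.shape)
    return latent + strength * mask * noise

rng = np.random.default_rng(0)

# Toy latent with one channel and an 8x8 spatial grid.
latent = rng.normal(size=(1, 8, 8))

# Assumed mask: perturb a 3x3 patch, leave everything else untouched.
mask = np.zeros((1, 8, 8))
mask[:, 2:5, 2:5] = 1.0

perturbed = masked_perturbation(latent, mask, strength=0.3, rng=rng)
```

Only the masked patch changes; the rest of the latent passes through unmodified. In a real pipeline this step would presumably run at selected early denoising steps only.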