Mask-Guided Attention Regulation for Anatomically Consistent Counterfactual CXR Synthesis

📅 2026-03-04
🤖 AI Summary
This work addresses two failure modes of diffusion-based counterfactual generation for chest X-ray (CXR) images: anatomical structure drift and unstable lesion expression, both of which the authors attribute to unconstrained attention. The authors propose an inference-time attention regulation framework that uses organ masks to modulate self- and cross-attention. It combines anatomy-aware attention regularization, which confines structural interactions to anatomical regions of interest, with pathology-guided cross-attention enhancement during the early denoising steps and a lightweight latent correction driven by an attention-concentration energy. Evaluations on CXR datasets show improved anatomical consistency and more precise, controllable lesion edits than standard diffusion editing, supporting localized counterfactual analysis and data augmentation for downstream tasks.
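
To make the idea of organ-mask gating concrete, the following is a minimal sketch and not the authors' implementation. It assumes gating is realized by subtracting a large bias from attention logits whenever the query and key tokens fall on opposite sides of the organ mask; the function name `gated_attention_weights` and the `penalty` value are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_attention_weights(queries, keys, organ_mask, penalty=1e4):
    """Self-attention weights with cross-region token pairs suppressed.

    queries, keys: lists of equal-length float vectors, one per spatial token.
    organ_mask:    list of 0/1 flags, 1 where the token lies inside the organ ROI.
    penalty:       hypothetical bias subtracted from the logit of any pair that
                   straddles the mask boundary (an assumption for illustration).
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    weights = []
    for i, q in enumerate(queries):
        logits = []
        for j, k in enumerate(keys):
            logit = scale * sum(a * b for a, b in zip(q, k))
            if organ_mask[i] != organ_mask[j]:
                logit -= penalty  # block interaction across the ROI boundary
            logits.append(logit)
        weights.append(softmax(logits))
    return weights


# Toy example: four tokens, the first two inside the organ mask.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
mask = [1, 1, 0, 0]
weights = gated_attention_weights(tokens, tokens, mask)
```

In this toy setup each token distributes its attention only among tokens in its own region, which is one plausible reading of "confining structural interactions to anatomical ROIs". A real diffusion pipeline would apply such a bias inside the attention layers of the denoiser at the resolution of each layer.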

📝 Abstract
Counterfactual generation for chest X-rays (CXR) aims to simulate plausible pathological changes while preserving patient-specific anatomy. However, diffusion-based editing methods often suffer from structural drift, where stable anatomical semantics propagate globally through attention and distort non-target regions, and unstable pathology expression, since subtle and localized lesions induce weak and noisy conditioning signals. We present an inference-time attention regulation framework for reliable counterfactual CXR synthesis. An anatomy-aware attention regularization module gates self-attention and anatomy-token cross-attention with organ masks, confining structural interactions to anatomical ROIs and reducing unintended distortions. A pathology-guided module enhances pathology-token cross-attention within target lung regions during early denoising and performs lightweight latent corrections driven by an attention-concentration energy, enabling controllable lesion localization and extent. Extensive evaluations on CXR datasets show improved anatomical consistency and more precise, controllable pathological edits compared with standard diffusion editing, supporting localized counterfactual analysis and data augmentation for downstream tasks.
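
The abstract does not define the attention-concentration energy, so the sketch below is only one possible form, stated as an assumption: the energy is one minus the fraction of a pathology token's cross-attention mass that falls inside the target lung region. The function name `concentration_energy` is hypothetical.

```python
def concentration_energy(attention_map, target_mask):
    """Return 1 - (attention mass inside the target region / total mass).

    attention_map: 2D list of non-negative cross-attention values for one
                   pathology token.
    target_mask:   2D list of 0/1 flags marking the target lung region.
    This functional form is an assumption for illustration, not the paper's
    stated definition.
    """
    total = sum(sum(row) for row in attention_map)
    if total == 0.0:
        return 1.0  # no attention anywhere: treat as fully unconcentrated
    inside = sum(
        value * flag
        for attn_row, mask_row in zip(attention_map, target_mask)
        for value, flag in zip(attn_row, mask_row)
    )
    return 1.0 - inside / total


# Toy 2x2 map: 40% of the attention mass lies inside the target (top row).
attention = [[0.2, 0.2], [0.3, 0.3]]
target = [[1, 1], [0, 0]]
energy = concentration_energy(attention, target)
```

Under this assumption the energy is 0 when all attention lies inside the target region and approaches 1 as attention leaks outside it. A latent correction of the kind the abstract describes would then nudge the latent in the direction that lowers this energy during early denoising steps, which requires differentiating through the denoiser and is not shown here.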
Problem

Research questions and friction points this paper is trying to address.

counterfactual generation
structural drift
pathology expression
anatomical consistency
chest X-ray
Innovation

Methods, ideas, or system contributions that make the work stand out.

attention regulation
anatomical consistency
counterfactual synthesis
diffusion models
mask-guided editing