Dig2DIG: Dig into Diffusion Information Gains for Image Fusion

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing diffusion-based image fusion methods rely on predefined modality guidance, failing to model the dynamic variation of modality importance and lacking theoretical guarantees. This work first reveals the spatiotemporal imbalance in information gain during denoising and introduces Diffusion Information Gain (DIG), a concept quantifying step-wise, modality-specific information acquisition. The authors propose a dynamic fusion framework with a theoretical guarantee that it reduces the upper bound of the generalization error. The method enables step-level dynamic quantification and adaptive fusion of modality contributions via variational inference, information-theoretic metrics, dynamic weight scheduling, and multi-stage feature alignment. Evaluated across diverse fusion scenarios, it achieves +2.1 dB PSNR and +0.032 SSIM improvements over state-of-the-art diffusion fusion approaches, while accelerating inference by 37%.

📝 Abstract
Image fusion integrates complementary information from multi-source images to generate more informative results. Recently, the diffusion model, which demonstrates unprecedented generative potential, has been explored in image fusion. However, these approaches typically incorporate predefined multimodal guidance into diffusion, failing to capture the dynamically changing significance of each modality, while lacking theoretical guarantees. To address this issue, we reveal a significant spatio-temporal imbalance in image denoising; specifically, the diffusion model produces dynamic information gains in different image regions with denoising steps. Based on this observation, we Dig into the Diffusion Information Gains (Dig2DIG) and theoretically derive a diffusion-based dynamic image fusion framework that provably reduces the upper bound of the generalization error. Accordingly, we introduce diffusion information gains (DIG) to quantify the information contribution of each modality at different denoising steps, thereby providing dynamic guidance during the fusion process. Extensive experiments on multiple fusion scenarios confirm that our method outperforms existing diffusion-based approaches in terms of both fusion quality and inference efficiency.
Problem

Research questions and friction points this paper is trying to address.

Dynamic modality significance in diffusion-based image fusion
Theoretical guarantees for diffusion model generalization error
Quantifying information contribution across denoising steps
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic information gains guide fusion process
Theoretical framework reduces generalization error
Quantifies modality contributions at denoising steps
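To make the core idea concrete, here is a minimal NumPy sketch of step-level dynamic fusion weighting. It is an illustration under stated assumptions, not the paper's actual implementation: `step_gain` uses the magnitude of change in a denoised estimate between consecutive steps as a stand-in for the paper's information-gain metric, and `dig_weights` turns the per-modality gains into per-pixel fusion weights via a softmax (both function names and the temperature `tau` are hypothetical).

```python
import numpy as np

def step_gain(x_prev, x_curr):
    """Proxy 'information gain': per-pixel magnitude of change in the
    denoised estimate between consecutive denoising steps.
    (Illustrative stand-in, not the paper's exact DIG metric.)"""
    return np.abs(x_curr - x_prev)

def dig_weights(gain_a, gain_b, tau=1.0):
    """Per-pixel softmax over two modalities' gains, giving dynamic
    fusion weights that sum to 1 at every pixel."""
    stacked = np.stack([gain_a, gain_b]) / tau
    e = np.exp(stacked - stacked.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)
    return w[0], w[1]

# Toy example: two 4x4 "modalities" with random denoised estimates.
rng = np.random.default_rng(0)
ga = step_gain(rng.random((4, 4)), rng.random((4, 4)))
gb = step_gain(rng.random((4, 4)), rng.random((4, 4)))
w_a, w_b = dig_weights(ga, gb)

# Fuse the two modality estimates at this step with the dynamic weights.
est_a, est_b = rng.random((4, 4)), rng.random((4, 4))
fused = w_a * est_a + w_b * est_b
```

Regions where one modality's estimate changes more between steps receive a larger weight at that step, which mirrors the paper's observation that information gains vary across image regions and denoising steps.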