Combined Dictionary Unfolding Network with Gradient-Adaptive Fidelity for Transferable Multi-Source Fusion

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the computational and memory cost of existing deep unfolding fusion methods, most of which are derived from alternating minimization and update the features of each modality separately, which limits deployment on resource-constrained edge devices. The authors propose CDNet, a lightweight Combined Dictionary Unfolding Network that translates the unique-common decomposition prior of coupled dictionary learning into a structurally constrained joint unfolding architecture for multi-source image fusion. Its core module, the CDBlock, follows a block-sparse interaction topology and jointly updates common and modality-specific representations, while a compact High- and Low-frequency Image Fidelity loss enables unsupervised training without ground-truth images. CDNet is evaluated on four tasks: multi-exposure image fusion, infrared and visible image fusion, medical image fusion, and infrared and visible image fusion for semantic segmentation. For infrared and visible image fusion, it leads on four of six metrics on TNO and five of six on RoadScene, surpassing the second-best method in PSNR by 1.23 dB and 1.59 dB, respectively.
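To make the idea of a high- and low-frequency fidelity term concrete, here is a minimal NumPy sketch. It is not the loss defined in the paper, whose exact form is not given in this summary. The box-blur band split, the averaged low-frequency target, the max-magnitude high-frequency target, and the names `box_blur`, `split_bands`, `hl_fidelity_loss`, and `weight_high` are all illustrative assumptions.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter with edge padding; stands in for any low-pass filter (assumption)."""
    size = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    # Sum the shifted copies of the padded image, then normalise by the window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def split_bands(img):
    """Return (low, high) so that low + high reconstructs the input."""
    low = box_blur(img)
    return low, img - low

def hl_fidelity_loss(fused, src_a, src_b, weight_high=1.0):
    """Hypothetical unsupervised fidelity term computed from the two sources only."""
    low_f, high_f = split_bands(fused)
    low_a, high_a = split_bands(src_a)
    low_b, high_b = split_bands(src_b)
    # Assumed targets: average the base layers, keep the stronger detail per pixel.
    low_target = 0.5 * (low_a + low_b)
    high_target = np.where(np.abs(high_a) >= np.abs(high_b), high_a, high_b)
    low_term = np.mean(np.abs(low_f - low_target))
    high_term = np.mean(np.abs(high_f - high_target))
    return low_term + weight_high * high_term
```

No ground-truth fused image appears anywhere in the computation, which is the property that makes unsupervised training possible. If both sources and the fused image are identical, the loss is zero.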

📝 Abstract

Deep Unfolding Network-based methods have emerged as effective solutions for multi-source image fusion by combining model-driven iterative optimization with data-driven deep learning. However, most existing deep unfolding image fusion methods are derived from alternating minimization, which updates the features of different modalities separately. This design introduces considerable computational and memory overhead, limiting deployment on resource-constrained edge devices. To address this issue, we propose CDNet, a lightweight Combined Dictionary Unfolding Network for multi-source image fusion. Rather than introducing a new sparse coding prior or empirically compressing an existing fusion network, CDNet translates the unique-common decomposition prior of coupled dictionary learning into a structurally constrained joint unfolding architecture. The resulting CDBlock follows a block-sparse interaction topology and performs a model-derived joint update of common and modality-specific representations, thereby streamlining feature learning and improving efficiency. In addition, we design a compact High- and Low-frequency Image Fidelity loss for unsupervised training without ground-truth images. We evaluate CDNet on four tasks, including multi-exposure image fusion, infrared and visible image fusion, medical image fusion, and infrared and visible image fusion for semantic segmentation. Experimental results show that CDNet achieves competitive or superior fusion performance with high efficiency. For infrared and visible image fusion, CDNet outperforms competing methods on four of six metrics on the TNO dataset and five of six metrics on the RoadScene dataset. In particular, it surpasses the second-best method by 1.23 dB and 1.59 dB in PSNR on TNO and RoadScene, respectively.
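The joint update described above can be illustrated with a classical, non-learned analogue. The sketch below assumes the usual coupled dictionary model, in which each source is written as x_i = D_c z_c + D_i z_i with a shared common code z_c and a modality-specific code z_i. It stacks both sources into one system and applies a single ISTA step to all codes at once. This is not the CDBlock itself: the function names, dimensions, and the choice of plain ISTA are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def build_joint_dictionary(d_common, d_unique_1, d_unique_2):
    """Stack x_i = D_c z_c + D_i z_i for both sources into one block-sparse matrix."""
    zeros_1 = np.zeros_like(d_unique_1)
    zeros_2 = np.zeros_like(d_unique_2)
    top = np.hstack([d_common, d_unique_1, zeros_2])     # source 1 never uses D_2
    bottom = np.hstack([d_common, zeros_1, d_unique_2])  # source 2 never uses D_1
    return np.vstack([top, bottom])

def objective(a, x, z, lam):
    """Least-squares data term plus an l1 sparsity penalty on the stacked code."""
    return 0.5 * np.sum((a @ z - x) ** 2) + lam * np.sum(np.abs(z))

def joint_ista_step(a, x, z, lam, step):
    """One ISTA step that updates common and specific codes together."""
    grad = a.T @ (a @ z - x)
    return soft_threshold(z - step * grad, lam * step)

# Toy problem with random dictionaries (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
m, kc, k1, k2 = 16, 8, 6, 6
a = build_joint_dictionary(
    rng.standard_normal((m, kc)),
    rng.standard_normal((m, k1)),
    rng.standard_normal((m, k2)),
)
x = rng.standard_normal(2 * m)              # the two sources, stacked
lam = 0.1
step = 1.0 / np.linalg.norm(a, 2) ** 2      # 1 / L, with L the squared spectral norm
z = np.zeros(a.shape[1])                    # stacked code [z_c, z_1, z_2]
history = [objective(a, x, z, lam)]
for _ in range(20):
    z = joint_ista_step(a, x, z, lam, step)
    history.append(objective(a, x, z, lam))
```

The two zero blocks in the stacked matrix are what keep the interaction pattern block-sparse: each modality-specific code touches only its own source, while the common code touches both. An alternating scheme would instead update the code groups one after another. A deep unfolding network would typically replace the fixed matrices and threshold with learned layers.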

Problem

Research questions and friction points this paper is trying to address.

multi-source image fusion

deep unfolding network

computational overhead

memory overhead

edge deployment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combined Dictionary Unfolding

Joint Feature Update

Block-Sparse Interaction