UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation

📅 2026-03-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work proposes UniFusion, a unified framework for image fusion that addresses a limitation of existing methods: they are typically confined to specific tasks and struggle to balance source fidelity and fusion quality in cross-task scenarios. The framework leverages DINOv3-derived semantic features to construct a shared semantic space, introduces a reconstruction-alignment loss to enforce source-aware consistency, and employs a bilevel optimization strategy to jointly optimize reconstruction and fusion objectives. Extensive experiments demonstrate superior visual quality, strong generalization across diverse fusion tasks (multi-exposure, multi-focus, and multi-modal image fusion), and robust adaptability in real-world settings.

📝 Abstract
Image fusion aims to integrate complementary information from multiple source images to produce a more informative and visually consistent representation, benefiting both human perception and downstream vision tasks. Despite recent progress, most existing fusion methods are designed for specific tasks (i.e., multi-modal, multi-exposure, or multi-focus fusion) and struggle to effectively preserve source information during the fusion process. This limitation primarily arises from task-specific architectures and the degradation of source information caused by deep-layer propagation. To overcome these issues, we propose UniFusion, a unified image fusion framework designed to achieve cross-task generalization. First, leveraging DINOv3 for modality-consistent feature extraction, UniFusion establishes a shared semantic space for diverse inputs. Second, to preserve the understanding of each source image, we introduce a reconstruction-alignment loss to maintain consistency between fused outputs and inputs. Finally, we employ a bilevel optimization strategy to decouple and jointly optimize reconstruction and fusion objectives, effectively balancing their coupling relationship and ensuring smooth convergence. Extensive experiments across multiple fusion tasks demonstrate UniFusion's superior visual quality, generalization ability, and adaptability to real-world scenarios. Code is available at https://github.com/dusongcheng/UniFusion.
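The reconstruction-alignment loss described in the abstract can be illustrated with a minimal sketch: the fused image should retain enough source information that each input can be recovered from it, so we penalize the gap between each reconstruction and its source. The per-source decoder `reconstruct` and the L1 distance here are assumptions for illustration; the paper's exact formulation is not given in the abstract.

```python
import numpy as np

def reconstruction_alignment_loss(fused, sources, reconstruct):
    """Hypothetical reconstruction-alignment loss: average L1 gap
    between each source image and its reconstruction from the fused
    output. `reconstruct(fused, i)` is an assumed per-source decoder."""
    losses = [np.abs(reconstruct(fused, i) - src).mean()
              for i, src in enumerate(sources)]
    return float(np.mean(losses))

# Toy usage with an identity "decoder" assumption: the fused image
# matches the first source exactly and misses the second entirely.
fused = np.ones((4, 4))
sources = [np.ones((4, 4)), np.zeros((4, 4))]
loss = reconstruction_alignment_loss(fused, sources, lambda f, i: f)
# mean of per-source losses [0.0, 1.0] -> 0.5
```

In a real system the decoder would be learned jointly with the fusion network, which is precisely the coupling the paper's bilevel strategy is said to manage.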
Problem

Research questions and friction points this paper is trying to address.

image fusion
source information preservation
cross-task generalization
multi-modal fusion
feature degradation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified Image Fusion
DINOv3 Feature Extraction
Reconstruction-Alignment Loss
Bilevel Optimization
Cross-Task Generalization
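The bilevel optimization named above can be sketched as an alternating scheme: a lower-level step updates the reconstruction parameters, then an upper-level step updates the fusion parameters with the lower level held fixed. The gradient functions, learning rate, and toy objectives below are assumptions; the abstract does not specify the actual update rules.

```python
def bilevel_step(theta_rec, theta_fuse, rec_grad, fuse_grad, lr=0.1):
    """Hypothetical alternating bilevel update: the lower level
    (reconstruction) is optimized first, then the upper level (fusion)
    takes a step using the freshly updated lower-level parameters."""
    theta_rec = theta_rec - lr * rec_grad(theta_rec, theta_fuse)     # lower-level step
    theta_fuse = theta_fuse - lr * fuse_grad(theta_rec, theta_fuse)  # upper-level step
    return theta_rec, theta_fuse

# Toy quadratic objectives: both parameters are pulled toward 0,
# so repeated alternating steps should converge near the optimum.
tr, tf = 1.0, 1.0
for _ in range(50):
    tr, tf = bilevel_step(tr, tf,
                          lambda r, f: 2 * r,   # grad of r**2
                          lambda r, f: 2 * f)   # grad of f**2
```

Decoupling the two objectives this way lets each sub-problem converge smoothly instead of fighting over shared gradients, which matches the balancing behavior the abstract attributes to the bilevel strategy.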