EAPFusion: Intrinsic Evolving Auxiliary Prior Guidance for Infrared and Visible Image Fusion

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing infrared and visible image fusion methods struggle to emphasize salient targets and preserve fine details at the same time, because they rely on static trained weights and suffer a granularity mismatch when coarse semantics from external auxiliary models are injected. To address this, the work proposes EAPFusion, a framework that drops external auxiliary models in favor of intrinsic, self-evolving priors. A compact prior set is updated progressively across scales and used to generate convolutional kernels dynamically, giving a prior-guided adaptive convolution. A channel-level fusion module that shuffles and interleaves infrared and visible channels is added to strengthen cross-modal complementarity. The authors report state-of-the-art qualitative and quantitative results on multiple datasets, including cross-dataset evaluation, and consistent gains on downstream semantic segmentation.
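
As a rough illustration of prior-guided adaptive convolution, the sketch below maps a small prior vector to a 3x3 kernel and applies it to a single-channel image. It is not the authors' implementation (their code has not been released). The names `describe`, `update_prior`, `generate_kernel`, `conv2d_same`, and `WEIGHTS`, the linear kernel generator, the `[mean, max]` descriptor, and the moving-average update rule are all assumptions chosen to keep the example short and dependency-free.

```python
from typing import List

Matrix = List[List[float]]


def describe(feature: Matrix) -> List[float]:
    """Summarize a feature map as [mean, max] (hypothetical descriptor)."""
    values = [v for row in feature for v in row]
    return [sum(values) / len(values), max(values)]


def update_prior(prior: List[float], feature: Matrix, rate: float = 0.5) -> List[float]:
    """Move the prior toward the current feature descriptor.

    Assumed update rule: an exponential moving average, applied once per scale.
    """
    target = describe(feature)
    return [(1.0 - rate) * p + rate * t for p, t in zip(prior, target)]


def generate_kernel(prior: List[float], weights: Matrix, k: int = 3) -> Matrix:
    """Turn a prior vector into a k x k kernel with one linear layer."""
    flat = [sum(w * p for w, p in zip(row, prior)) for row in weights]
    if len(flat) != k * k:
        raise ValueError("weights must have k*k rows")
    return [flat[i * k:(i + 1) * k] for i in range(k)]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Zero-padded cross-correlation that keeps the input size."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - r, x + j - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


# Toy generator weights (9 rows, one per kernel tap; 2 columns, one per prior entry).
# Column 0 contributes an identity kernel, column 1 a 3x3 box blur.
WEIGHTS: Matrix = [[0.0, 1.0 / 9.0] for _ in range(9)]
WEIGHTS[4][0] = 1.0

image: Matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
sharp = conv2d_same(image, generate_kernel([1.0, 0.0], WEIGHTS))   # identity filter
smooth = conv2d_same(image, generate_kernel([0.0, 1.0], WEIGHTS))  # box blur
```

The same generator weights produce different filters for different priors, which is the sense in which the kernel is instance-adaptive rather than fixed after training. A real model would learn the generator and operate on multi-channel feature maps.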

📝 Abstract

Infrared-visible image fusion aims to create an information-rich fused image by integrating the complementary thermal saliency from infrared sensing and fine textures from visible imaging. Such accurate fusion is essential for real-world perception applications in complex scenes, including nighttime autonomous driving, search and rescue, and surveillance, and can further benefit downstream tasks such as semantic segmentation. However, most existing fusion methods rely upon static trained weights that cannot adapt to scene-specific content at inference time, and often suffer from a granularity mismatch when coarse auxiliary semantics are injected, which makes it difficult to simultaneously highlight targets and preserve details. In this work, we propose EAPFusion to address these issues by using self-evolving intrinsic priors instead of relying on external auxiliary models. Concretely, EAPFusion maintains a compact set of intrinsic priors and progressively updates them across scales. These evolved priors are utilized to dynamically generate convolutional kernels, shifting the paradigm from fixed, pre-trained filters to instance-adaptive parameters via prior-conditioned dynamic convolution. Furthermore, we design a channel-level fusion module that shuffles and interleaves infrared and visible channels, applying local channel mixing to boost cross-modal complementarity. Experiments on different datasets, including cross-dataset evaluation and semantic segmentation, show that the proposed method achieves state-of-the-art quantitative and qualitative fusion results, and consistently boosts downstream performance. Code is coming soon.
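
The channel-level fusion idea described above (interleave infrared and visible channels, then mix each channel with its neighbours) can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's module: `interleave_channels` and `local_channel_mix` are hypothetical names, each channel is flattened to a list of pixel values, and the mixing step is assumed to be a fixed 1D weighted sum along the channel axis with zero padding at the ends.

```python
from typing import List

Channel = List[float]


def interleave_channels(ir: List[Channel], vis: List[Channel]) -> List[Channel]:
    """Alternate channels from both modalities: ir0, vis0, ir1, vis1, ..."""
    if len(ir) != len(vis):
        raise ValueError("both modalities need the same number of channels")
    mixed: List[Channel] = []
    for ir_ch, vis_ch in zip(ir, vis):
        mixed.extend([ir_ch, vis_ch])
    return mixed


def local_channel_mix(channels: List[Channel], weights: List[float]) -> List[Channel]:
    """Replace each channel with a weighted sum of itself and nearby channels.

    `weights` has odd length; positions outside the channel range count as zero.
    """
    radius = len(weights) // 2
    n = len(channels)
    size = len(channels[0])
    out: List[Channel] = []
    for c in range(n):
        acc = [0.0] * size
        for offset, w in enumerate(weights):
            src = c + offset - radius
            if 0 <= src < n:
                for i, v in enumerate(channels[src]):
                    acc[i] += w * v
        out.append(acc)
    return out


# Two one-pixel channels per modality, to keep the arithmetic easy to follow.
ir_feats = [[1.0], [2.0]]
vis_feats = [[10.0], [20.0]]
interleaved = interleave_channels(ir_feats, vis_feats)
fused = local_channel_mix(interleaved, [0.25, 0.5, 0.25])
```

Because interleaving places infrared and visible channels next to each other, even a small mixing window draws on both modalities for every output channel. In a trained network the mixing weights would be learned rather than fixed.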

Problem

Research questions and friction points this paper is trying to address.

infrared-visible image fusion

static trained weights

granularity mismatch

scene-specific adaptation

detail preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

intrinsic evolving priors

dynamic convolution

channel-level fusion