🤖 AI Summary
Existing image fusion methods rely on high-level semantic task interactions, suffering from semantic gaps that limit generalizability and universality. This paper proposes a low-level vision–driven paradigm—specifically, pixel-level reconstruction—as a foundation for universal fusion, eliminating task-specific semantic modeling. We introduce GIFNet, a unified representation architecture trained via multi-task joint learning under pixel-level supervision. Our key contributions are: (i) the first task-agnostic fusion framework guided by low-level task interactions; (ii) zero-shot generalization of a single model to unseen modality pairs (e.g., infrared/visible-light, MRI/PET); and (iii) emergent capability for single-modality image enhancement. GIFNet achieves state-of-the-art performance across diverse cross-modal fusion benchmarks, demonstrating superior generalizability, architectural unity, and practical applicability.
📝 Abstract
Advanced image fusion methods mostly prioritise high-level vision tasks, where task interaction is hindered by semantic gaps and requires complex bridging mechanisms. In contrast, we propose to leverage low-level vision tasks from digital photography fusion, enabling effective feature interaction through pixel-level supervision. This new paradigm provides strong guidance for unsupervised multimodal fusion without relying on abstract semantics, enhancing task-shared feature learning for broader applicability. Owing to the hybrid image features and enhanced universal representations, the proposed GIFNet supports diverse fusion tasks, achieving high performance across both seen and unseen scenarios with a single model. Uniquely, experimental results reveal that our framework also supports single-modality enhancement, offering superior flexibility for practical applications. Our code will be available at https://github.com/AWCXV/GIFNet.
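To make the idea of pixel-level supervision concrete, the following is a minimal, generic sketch of a per-pixel reconstruction loss for fusion. The function name, L1 distance, and equal weighting are illustrative assumptions for exposition, not GIFNet's actual training objective.

```python
import numpy as np

def pixel_level_fusion_loss(fused, src_a, src_b, w=0.5):
    """Generic pixel-level supervision sketch (assumption, not GIFNet's loss):
    penalise the fused image's per-pixel L1 deviation from each source,
    so no high-level semantic labels are needed."""
    return w * np.abs(fused - src_a).mean() + (1 - w) * np.abs(fused - src_b).mean()

# Toy example: fusing two 4x4 "images" by simple averaging.
a = np.zeros((4, 4))
b = np.ones((4, 4))
fused = 0.5 * (a + b)
loss = pixel_level_fusion_loss(fused, a, b)  # -> 0.5
```

Because the signal is defined entirely at the pixel level, the same loss form applies unchanged to any modality pair, which is the property the paradigm exploits.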