Hybrid Fusion: One-Minute Efficient Training for Zero-Shot Cross-Domain Image Fusion

📅 2026-02-24

📈 Citations: 0

✨ Influential: 0

career value

223K/year

🤖 AI Summary

Existing image fusion methods are often limited by insufficient adaptability or inefficiencies in deep learning model training, particularly due to mismatches between training and inference resolutions. This work proposes a hybrid fusion framework that, for the first time, decouples policy learning from pixel synthesis: a lightweight, learnable U-Net generates dynamic guidance maps to drive a fixed Laplacian pyramid kernel for linear pixel-level fusion. The approach enables full-resolution end-to-end training without reliance on external models, achieving state-of-the-art performance within 1–2 minutes of training from scratch. Moreover, it demonstrates zero-shot cross-domain generalization, effectively balancing efficiency, fidelity, and robustness across diverse tasks such as infrared-visible and medical image fusion.

Technology Category

Application Category

📝 Abstract

Image fusion seeks to integrate complementary information from multiple sources into a single, superior image. While traditional methods are fast, they lack adaptability and performance. Conversely, deep learning approaches achieve state-of-the-art (SOTA) results but suffer from critical inefficiencies: their reliance on slow, resource-intensive, patch-based training introduces a significant gap with full-resolution inference. We propose a novel hybrid framework that resolves this trade-off. Our method utilizes a learnable U-Net to generate a dynamic guidance map that directs a classic, fixed Laplacian pyramid fusion kernel. This decoupling of policy learning from pixel synthesis enables remarkably efficient full-resolution training, eliminating the train-inference gap. Consequently, our model achieves SOTA-comparable performance in about one minute on a RTX 4090 or two minutes on a consumer laptop GPU from scratch without any external model and demonstrates powerful zero-shot generalization across diverse tasks, from infrared-visible to medical imaging. By design, the fused output is linearly constructed solely from source information, ensuring high faithfulness for critical applications. The codes are available at https://github.com/Zirconium233/HybridFusion

Problem

Research questions and friction points this paper is trying to address.

image fusion

zero-shot

cross-domain

training efficiency

full-resolution inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid Fusion

Zero-Shot Generalization

Full-Resolution Training