LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

📅 2025-08-29
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the poor real-time performance and difficulty of deploying infrared–visible image fusion methods on low-power mobile devices, this paper proposes an efficient fusion framework based on a learnable multi-modal lookup table (MM-LUT). We innovatively design a fusion-guided dual-path LUT architecture: a low-order approximation encoding models pixel-wise mappings, while a high-level joint contextual encoding captures cross-modal semantic correlations. Furthermore, we introduce a ground-truth-free LUT knowledge distillation strategy to efficiently transfer knowledge from the heavy MM-Net to the lightweight MM-LUT. Experiments demonstrate that our method achieves over 10× faster fusion speed than current lightweight state-of-the-art approaches, while maintaining competitive fusion quality. It significantly reduces computational overhead, enabling real-time inference and low-power edge deployment.
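The pixel-wise path of the dual-path design can be illustrated with a minimal sketch: a 2D table indexed by quantized infrared/visible intensities, read out with bilinear interpolation so the coarse grid still yields a smooth mapping. The function name, grid size, and the identity-average table below are illustrative assumptions, not the paper's actual MM-LUT.

```python
import numpy as np

def lut_fuse_pixelwise(ir, vis, lut, bins=17):
    """Fuse two single-channel images by indexing a learned 2D LUT.

    ir, vis : float arrays in [0, 1] with the same shape.
    lut     : (bins, bins) table of fused intensities (learned offline).
    Bilinear interpolation between the four nearest table entries keeps
    the mapping smooth despite the coarse grid.
    """
    x = ir * (bins - 1)
    y = vis * (bins - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, bins - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, bins - 2)
    fx = x - x0  # fractional offsets inside the cell
    fy = y - y0
    # Bilinear blend of the four surrounding table entries.
    top = lut[x0, y0] * (1 - fy) + lut[x0, y0 + 1] * fy
    bot = lut[x0 + 1, y0] * (1 - fy) + lut[x0 + 1, y0 + 1] * fy
    return top * (1 - fx) + bot * fx

# Toy table: fused value = mean of the two inputs (identity-average LUT).
bins = 17
g = np.linspace(0.0, 1.0, bins)
lut = (g[:, None] + g[None, :]) / 2.0
ir = np.array([[0.0, 1.0]])
vis = np.array([[1.0, 1.0]])
fused = lut_fuse_pixelwise(ir, vis, lut, bins)  # [[0.5, 1.0]]
```

At inference time the whole fusion collapses to index arithmetic and a handful of multiply-adds per pixel, which is why a LUT can be orders of magnitude faster than running a network.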

📝 Abstract
Current advanced research on infrared and visible image fusion primarily focuses on improving fusion performance, often neglecting applicability on real-time fusion devices. In this paper, we propose a novel approach toward extremely fast fusion via distillation to learnable lookup tables specifically designed for image fusion, termed LUT-Fuse. First, we develop a look-up table structure utilizing low-order approximation encoding and high-level joint contextual scene encoding, which is well suited to multi-modal fusion. Moreover, given the lack of ground truth in multi-modal image fusion, we naturally propose an efficient LUT distillation strategy instead of traditional quantization LUT methods. By integrating the performance of the multi-modal fusion network (MM-Net) into the MM-LUT model, our method achieves significant breakthroughs in efficiency and performance. It typically requires less than one-tenth of the time of current lightweight SOTA fusion algorithms, ensuring high operational speed across various scenarios, even on low-power mobile devices. Extensive experiments validate the superiority, reliability, and stability of our fusion approach. The code is available at https://github.com/zyb5/LUT-Fuse.
Problem

Research questions and friction points this paper is trying to address.

Achieving real-time infrared and visible image fusion
Reducing computational cost for mobile deployment
Maintaining performance without ground truth data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learnable lookup tables for fast fusion
Distillation strategy without ground truth
Low-order approximation and contextual encoding
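The ground-truth-free distillation idea above can be illustrated with a toy stand-in: fill each LUT cell with the mean teacher output of the pixels that fall into it, which is the least-squares fit under nearest-neighbour binning. The paper trains the LUT by gradient-based distillation from MM-Net; the closed-form binning and the max-fusion "teacher" below are simplifications for illustration only.

```python
import numpy as np

def distill_lut(ir, vis, teacher_out, bins=17):
    """Fit a 2D LUT to mimic a teacher's fused outputs (no ground truth).

    Each cell stores the mean teacher output over the pixels whose
    (ir, vis) intensities round into that cell.
    """
    xi = np.clip(np.round(ir * (bins - 1)).astype(int), 0, bins - 1)
    yi = np.clip(np.round(vis * (bins - 1)).astype(int), 0, bins - 1)
    lut_sum = np.zeros((bins, bins))
    lut_cnt = np.zeros((bins, bins))
    np.add.at(lut_sum, (xi, yi), teacher_out)  # accumulate per cell
    np.add.at(lut_cnt, (xi, yi), 1.0)
    # Average per cell; empty cells default to 0.
    return np.divide(lut_sum, lut_cnt, out=np.zeros_like(lut_sum),
                     where=lut_cnt > 0)

# Toy "teacher": pixel-wise max of the two modalities, on a dense grid.
g = np.linspace(0.0, 1.0, 33)
ir, vis = np.meshgrid(g, g, indexing="ij")
teacher = np.maximum(ir, vis)
lut = distill_lut(ir, vis, teacher)
```

The point of distilling rather than quantizing is that the student never needs fusion ground truth: any teacher output defines the regression target, so the heavy network is only run offline.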
Xunpeng Yi
Wuhan University
Computer Vision
Yibing Zhang
Electronic Information School, Wuhan University, Wuhan 430072, China
Xinyu Xiang
Electronic Information School, Wuhan University, Wuhan 430072, China
Qinglong Yan
Electronic Information School, Wuhan University, Wuhan 430072, China
Han Xu
School of Automation, Southeast University, Nanjing 210096, China
Jiayi Ma
Wuhan University
Computer Vision, Image Fusion, Image Matching