RT-X Net: RGB-Thermal cross attention network for Low-Light Image Enhancement

📅 2025-05-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address severe noise, strong illumination interference, and significant detail loss in nighttime low-light image enhancement, this paper proposes RT-X Net, an RGB-thermal infrared cross-modal fusion framework. The core methodological innovation is an RGB-thermal cross-attention mechanism that enables adaptive feature alignment and complementary enhancement between the two modalities. The paper also introduces V-TIEE, the first registered visible-thermal night-scene enhancement benchmark dataset, comprising 50 multi-scenario image pairs, and designs an end-to-end self-attention-driven fusion network. Joint training and evaluation on LLVIP and V-TIEE demonstrate consistent improvements: average PSNR increases by 1.8 dB and SSIM by 0.023 over state-of-the-art methods. All code and the V-TIEE dataset are publicly released.
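For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration; this is not the paper's evaluation code.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are flat lists of pixel intensities (an illustrative simplification).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy data: every pixel is off by 10 intensity levels, so MSE = 100.
reference = [100, 150, 200, 250]
estimate = [110, 140, 210, 240]
score = psnr(reference, estimate)
```

Because PSNR is logarithmic in the mean squared error, a 1.8 dB gain at a fixed peak value corresponds to dividing the MSE by 10^0.18, roughly 1.5, which is about a one-third reduction in error.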

📝 Abstract
In nighttime conditions, high noise levels and bright illumination sources degrade image quality, making low-light image enhancement challenging. Thermal images provide complementary information, offering richer textures and structural details. We propose RT-X Net, a cross-attention network that fuses RGB and thermal images for nighttime image enhancement. We leverage self-attention networks for feature extraction and a cross-attention mechanism for fusion to effectively integrate information from both modalities. To support research in this domain, we introduce the Visible-Thermal Image Enhancement Evaluation (V-TIEE) dataset, comprising 50 co-located visible and thermal images captured under diverse nighttime conditions. Extensive evaluations on the publicly available LLVIP dataset and our V-TIEE dataset demonstrate that RT-X Net outperforms state-of-the-art methods in low-light image enhancement. The code and the V-TIEE dataset are available at https://github.com/jhakrraman/rt-xnet.
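The cross-attention fusion described in the abstract can be illustrated with a generic scaled dot-product sketch in which RGB tokens act as queries and thermal tokens supply keys and values. This is not the RT-X Net implementation: the function names (`cross_attention`, `fuse`), the omission of learned Q/K/V projections, the residual addition, and the toy feature values are all assumptions chosen to keep the example short and dependency-free.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(query_feats, context_feats):
    """Scaled dot-product attention: queries from one modality,
    keys and values from the other.

    Learned projections are omitted (treated as identity) in this sketch.
    """
    d = len(query_feats[0])
    keys_t = [list(col) for col in zip(*context_feats)]  # transpose of keys
    scores = matmul(query_feats, keys_t)                 # (n_query, n_context)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, context_feats)            # (n_query, d)
    return attended, weights

def fuse(rgb_feats, thermal_feats):
    """Residual fusion (an assumption): each RGB token plus the thermal
    information it attends to."""
    attended, _ = cross_attention(rgb_feats, thermal_feats)
    return [[r + a for r, a in zip(r_row, a_row)]
            for r_row, a_row in zip(rgb_feats, attended)]

# Toy example: 3 RGB tokens and 2 thermal tokens, 4 channels each.
rgb = [[0.1, 0.2, 0.0, 0.3], [0.0, 0.1, 0.4, 0.1], [0.2, 0.2, 0.2, 0.2]]
thermal = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.7]]
fused = fuse(rgb, thermal)
```

Each row of the attention weights sums to 1, so every fused RGB token receives a convex combination of thermal tokens. In a trained network the projections and the fusion rule would be learned rather than fixed as they are here.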
Problem

Research questions and friction points this paper is trying to address.

Enhancing low-light RGB images using thermal data fusion
Reducing noise and improving details in nighttime conditions
Integrating cross-modal features via attention mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-attention network for RGB-thermal fusion
Self-attention for feature extraction enhancement
New V-TIEE dataset for nighttime evaluation
Raman Jha
Indian Institute of Technology, Madras
Adithya Lenka
Indian Institute of Technology, Madras
Mani Ramanagopal
Carnegie Mellon University
Aswin Sankaranarayanan
Carnegie Mellon University
Kaushik Mitra
Department of Electrical Engineering, IIT Madras
Computational Imaging, Computer Vision, Machine Learning