Thermal Image Refinement with Depth Estimation using Recurrent Networks for Monocular ORB-SLAM3

📅 2026-03-16
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of enabling monocular thermal-inertial SLAM for unmanned aerial vehicles in GPS-denied and visually degraded environments. To this end, the authors propose a lightweight supervised depth estimation network that combines recurrent blocks with a thermal refinement network (T-RefNet) to extract temporally consistent depth cues from non-radiometric thermal imagery. The estimated depth is incorporated into the ORB-SLAM3 framework. This approach achieves high-precision depth estimation and localization using only non-radiometric thermal data, eliminating the need for costly radiometric thermal cameras. Experimental results demonstrate an absolute relative depth error of 0.06 on the VIVID++ dataset and below 0.10 on a newly collected indoor non-radiometric thermal dataset, with SLAM trajectory errors consistently under 0.4 meters.
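The summary's emphasis on "temporally consistent depth cues" can be illustrated with a deliberately simplified sketch: the paper's recurrent blocks learn temporal dependencies, but the basic effect of fusing each new depth map with the running estimate can be shown with a plain exponential moving average (the function name and `alpha` parameter are illustrative, not from the paper):

```python
import numpy as np

def temporal_ema(depth_frames, alpha=0.6):
    """Exponentially-weighted temporal fusion of per-frame depth maps.

    A simplified stand-in for a learned recurrent module: each incoming
    depth map is blended with the running estimate, so the output
    sequence varies smoothly over time instead of flickering per frame.
    """
    fused = None
    out = []
    for d in depth_frames:
        d = np.asarray(d, dtype=np.float64)
        # First frame initializes the state; later frames are blended in.
        fused = d if fused is None else alpha * d + (1.0 - alpha) * fused
        out.append(fused)
    return out
```

A learned recurrent block replaces the fixed `alpha` with state updates conditioned on image features, but the goal is the same: suppress frame-to-frame depth jitter, which is especially pronounced in low-contrast thermal imagery.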

๐Ÿ“ Abstract
Autonomous navigation in GPS-denied and visually degraded environments remains challenging for unmanned aerial vehicles (UAVs). To this end, we investigate the use of a monocular thermal camera as a standalone sensor on a UAV platform for real-time depth estimation and simultaneous localization and mapping (SLAM). To extract depth information from thermal images, we propose a novel pipeline employing a lightweight supervised network with recurrent blocks (RBs) integrated to capture temporal dependencies, enabling more robust predictions. The network combines lightweight convolutional backbones with a thermal refinement network (T-RefNet) to refine raw thermal inputs and enhance feature visibility. The refined thermal images and predicted depth maps are integrated into ORB-SLAM3, enabling thermal-only localization. Unlike previous methods, the network is trained on a custom non-radiometric dataset, obviating the need for high-cost radiometric thermal cameras. Experimental results on public datasets and UAV flights demonstrate competitive depth accuracy and robust SLAM performance under low-light conditions. On the radiometric VIVID++ (indoor-dark) dataset, our method achieves an absolute relative error of approximately 0.06, compared to baselines exceeding 0.11. On our non-radiometric indoor dataset, baseline errors remain above 0.24, whereas our approach stays below 0.10. Thermal-only ORB-SLAM3 maintains a mean trajectory error under 0.4 m.
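The headline numbers (0.06, 0.10, 0.24) are absolute relative depth errors, a standard monocular-depth metric. As a minimal sketch (the function name and `min_depth` mask are illustrative conventions, not taken from the paper), it is the mean per-pixel relative deviation from ground truth over valid pixels:

```python
import numpy as np

def abs_rel_error(pred, gt, min_depth=1e-3):
    """Absolute relative depth error: mean(|pred - gt| / gt) over valid pixels.

    Pixels without reliable ground truth (depth <= min_depth) are masked out,
    as is common practice when evaluating against sparse or clipped depth maps.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > min_depth
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# Toy check: a uniform 10% overestimate yields an AbsRel of 0.10,
# comparable in magnitude to the paper's reported errors.
gt = np.full((4, 4), 2.0)
pred = gt * 1.1
print(round(abs_rel_error(pred, gt), 3))  # → 0.1
```

Because the metric is relative, an AbsRel of 0.06 corresponds to roughly 6% average depth deviation regardless of scene scale.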
Problem

Research questions and friction points this paper is trying to address.

thermal imaging
depth estimation
monocular SLAM
UAV navigation
GPS-denied environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Thermal SLAM
Depth Estimation
Recurrent Networks
Non-radiometric Thermal Imaging
ORB-SLAM3
🔎 Similar Papers
Hürkan Şahin
Automatic Control Group (RAT), Paderborn University, 33098 Paderborn, Germany
Huy Xuan Pham
Department of Electrical and Computer Engineering, Aarhus University, 8000 Aarhus C, Denmark, and also with Upteko ApS, Denmark
Van Huyen Dang
Automatic Control Group (RAT), Paderborn University, 33098 Paderborn, Germany
Alper Yegenoglu
Automatic Control Group (RAT), Paderborn University, 33098 Paderborn, Germany
Erdal Kayacan
Full Professor at Paderborn University
Robotics, control, unmanned systems