ThermoSplat: Cross-Modal 3D Gaussian Splatting with Feature Modulation and Geometry Decoupling

📅 2026-01-22
🤖 AI Summary
Existing approaches to fusing RGB and thermal infrared data either overlook cross-modal correspondences or rely on shared representations that do not account for the distinct physical characteristics of each modality, which limits reconstruction robustness under complex illumination and adverse weather. This work proposes a cross-modal FiLM (feature-wise linear modulation) mechanism that uses thermal structural priors to guide visible texture synthesis, together with a modality-adaptive geometric decoupling strategy that learns independent opacity offsets and runs a separate rasterization pass for the thermal branch. A hybrid rendering pipeline combines an explicit spherical harmonics representation with an implicit neural decoder, and the method reports state-of-the-art rendering quality for both visible and thermal views on the RGBT-Scenes dataset.
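
To make the FiLM idea concrete, here is a minimal sketch in plain Python of feature-wise linear modulation: a conditioning vector predicts a per-channel scale (gamma) and shift (beta) that are applied to a shared feature vector. The function names, the single linear predictors, and the toy dimensions are illustrative assumptions, not the authors' architecture; the summary only states that thermal structural priors condition the shared features.

```python
def linear(weights, bias, x):
    """Affine map y = W x + b on plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """FiLM: per-channel scale and shift predicted from `cond`.

    `features` stands in for a shared latent feature vector and `cond`
    for a thermal prior vector (both hypothetical placeholders).
    """
    gamma = linear(w_gamma, b_gamma, cond)  # one scale per feature channel
    beta = linear(w_beta, b_beta, cond)     # one shift per feature channel
    return [g * f + b for g, f, b in zip(gamma, features, beta)]


# Toy example: 3 feature channels conditioned on a 2-dim thermal prior.
shared = [0.5, -1.0, 2.0]
thermal_prior = [1.0, 2.0]
w_gamma = [[1, 0], [0, 1], [1, 1]]   # hypothetical weights
w_beta = [[0, 0], [0, 0], [0, 0]]    # shift depends only on the bias here
print(film(shared, thermal_prior, w_gamma, [0, 0, 0], w_beta, [1, 0, -1]))
# [1.5, -2.0, 5.0]
```

Setting the gamma predictor to output ones and the beta predictor to output zeros recovers the unmodulated features, which is a common way to initialize such a layer so that training starts from the unconditioned model.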

📝 Abstract
Multi-modal scene reconstruction integrating RGB and thermal infrared data is essential for robust environmental perception across diverse lighting and weather conditions. However, extending 3D Gaussian Splatting (3DGS) to multi-spectral scenarios remains challenging. Current approaches often struggle to fully leverage the complementary information in multi-modal data: they typically either neglect cross-modal correlations or use shared representations that cannot adapt to the complex structural correlations and physical discrepancies between spectra. To address these limitations, we propose ThermoSplat, a novel framework that enables deep spectral-aware reconstruction through active feature modulation and adaptive geometry decoupling. First, we introduce a Spectrum-Aware Adaptive Modulation that dynamically conditions shared latent features on thermal structural priors, guiding visible texture synthesis with reliable cross-modal geometric cues. Second, to accommodate modality-specific geometric inconsistencies, we propose a Modality-Adaptive Geometric Decoupling scheme that learns independent opacity offsets and executes an independent rasterization pass for the thermal branch. Additionally, a hybrid rendering pipeline integrates explicit Spherical Harmonics with implicit neural decoding, ensuring both semantic consistency and high-frequency detail preservation. Extensive experiments on the RGBT-Scenes dataset demonstrate that ThermoSplat achieves state-of-the-art rendering quality across both visible and thermal spectra.
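
The geometric decoupling step can be illustrated with a small sketch. It assumes, purely for illustration, that each Gaussian stores one shared opacity logit, that the thermal branch adds a learned per-Gaussian offset in logit space before the sigmoid, and that each modality is then composited in its own front-to-back pass along a ray. The abstract does not specify where the offset is applied, so `render_two_passes` and its arguments are hypothetical; only the front-to-back alpha compositing rule is the standard one used in 3DGS.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def composite(values, alphas):
    """Front-to-back alpha compositing of depth-sorted samples on one ray."""
    out, transmittance = 0.0, 1.0
    for v, a in zip(values, alphas):
        out += transmittance * a * v   # contribution weighted by visibility
        transmittance *= 1.0 - a       # light remaining after this sample
    return out


def render_two_passes(opacity_logits, thermal_offsets, rgb_vals, thermal_vals):
    """Render one ray twice: once per modality, with decoupled opacities.

    Assumption: the thermal opacity is sigmoid(shared_logit + offset).
    Values are scalar intensities to keep the sketch short.
    """
    rgb_alphas = [sigmoid(l) for l in opacity_logits]
    thermal_alphas = [sigmoid(l + d)
                      for l, d in zip(opacity_logits, thermal_offsets)]
    return composite(rgb_vals, rgb_alphas), composite(thermal_vals, thermal_alphas)


# Two Gaussians on a ray; the first is made nearly transparent in thermal only.
rgb, thermal = render_two_passes(
    opacity_logits=[0.0, 0.0],
    thermal_offsets=[-50.0, 0.0],
    rgb_vals=[1.0, 0.0],
    thermal_vals=[1.0, 0.0],
)
print(rgb, thermal)
```

With zero offsets the two passes produce identical opacities, so the offsets only need to encode where the thermal geometry departs from the visible one.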
Problem

Research questions and friction points this paper is trying to address.

multi-modal reconstruction
3D Gaussian Splatting
thermal infrared
cross-modal correlation
spectral-aware rendering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-Modal FiLM Modulation
Modality-Adaptive Geometric Decoupling
3D Gaussian Splatting
Thermal Infrared Fusion
Hybrid Rendering
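
As a rough illustration of the hybrid rendering idea, the sketch below combines an explicit degree-0 spherical harmonics color (using the constant and 0.5 offset from the 3DGS reference convention) with a small additive residual from a stand-in decoder. How ThermoSplat actually merges the two branches is not described here, so the additive combination, `tiny_decoder`, and `residual_scale` are all assumptions made for the example.

```python
import math

SH_C0 = 0.28209479177387814  # degree-0 real SH basis constant, 1 / (2 * sqrt(pi))


def sh_dc_color(f_dc):
    """Explicit branch: view-independent color from degree-0 SH coefficients."""
    return [SH_C0 * c + 0.5 for c in f_dc]


def tiny_decoder(latent, weights, bias):
    """Implicit branch: one tanh layer standing in for a neural decoder."""
    return [math.tanh(sum(w * z for w, z in zip(row, latent)) + b)
            for row, b in zip(weights, bias)]


def hybrid_color(f_dc, latent, weights, bias, residual_scale=0.1):
    """Assumed combination: SH base color plus a scaled decoder residual."""
    base = sh_dc_color(f_dc)
    residual = tiny_decoder(latent, weights, bias)
    # Clamp to the displayable [0, 1] range.
    return [min(1.0, max(0.0, b + residual_scale * r))
            for b, r in zip(base, residual)]


# With a zeroed decoder the output falls back to the explicit SH color.
zero_weights = [[0.0, 0.0]] * 3
print(hybrid_color([0.0, 0.0, 0.0], [1.0, -1.0], zero_weights, [0.0, 0.0, 0.0]))
# [0.5, 0.5, 0.5]
```

Keeping the explicit term as the base means the decoder only has to model what the low-order SH coefficients miss, which is one plausible reading of combining consistency with high-frequency detail.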
Zhaoqi Su
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
Shihai Chen
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
Xinyan Lin
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
Liqin Huang
Fuzhou University
computer vision, machine learning
Zhipeng Su
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
Xiaoqiang Lu
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China