🤖 AI Summary
This work addresses the challenge of cross-modal RGB-X sensor data alignment, which typically relies on expensive hardware calibration. The authors propose a novel cross-modal view synthesis method that requires no depth or calibration information from the X modality. By leveraging only low-cost COLMAP processing on RGB images, the approach achieves 3D-consistent novel view synthesis through a pipeline comprising RGB-X image matching, confidence-aware guided point cloud densification, self-matching filtering, and integration with 3D Gaussian Splatting. This study presents the first demonstration of high-quality cross-modal alignment in the absence of any 3D priors from the X modality, substantially lowering the barrier to multimodal data acquisition and removing a key bottleneck in scaling real-world RGB-X dataset collection.
📝 Abstract
We present the first study of cross-sensor view synthesis across different modalities. We examine a practical, fundamental, yet widely overlooked problem: obtaining aligned RGB-X data. Most prior RGB-X work assumes such pairs already exist and focuses on modality fusion, yet producing them in practice demands substantial engineering effort in calibration. We propose a match-densify-consolidate method. First, we perform RGB-X image matching followed by guided point densification. Using the proposed confidence-aware densification and self-matching filtering, we attain better view synthesis, and we then consolidate the results in 3D Gaussian Splatting (3DGS). Our method uses no 3D priors for the X sensor and assumes only nearly no-cost COLMAP processing on the RGB images. We aim to remove cumbersome calibration for diverse RGB-X sensors and advance cross-sensor learning with a scalable solution that breaks through the bottleneck in large-scale real-world RGB-X data collection.
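The match-densify-consolidate pipeline described above can be sketched as a toy 1-D skeleton. This is an illustrative sketch only: every function name, the confidence heuristic, and the thresholds are assumptions for exposition, not the authors' implementation, and the final 3DGS consolidation stage is omitted.

```python
# Toy sketch of a match-densify-consolidate pipeline (hypothetical;
# not the paper's actual method). Features and points are 1-D scalars
# for simplicity.

def match_rgb_x(rgb_feats, x_feats):
    # Cross-modal matching stand-in: pair each RGB feature with its
    # nearest X feature and derive a confidence in (0, 1] from the
    # descriptor distance (closer match -> higher confidence).
    matches = []
    for i, f in enumerate(rgb_feats):
        j, d = min(((j, abs(f - g)) for j, g in enumerate(x_feats)),
                   key=lambda t: t[1])
        matches.append((i, j, 1.0 / (1.0 + d)))  # (rgb_idx, x_idx, conf)
    return matches

def densify(sparse_points, matches, conf_thresh=0.5):
    # Confidence-aware guided densification: only matches above the
    # threshold spawn new points (here, midpoints of the matched pair).
    dense = list(sparse_points)
    for i, j, conf in matches:
        if conf >= conf_thresh:
            dense.append((sparse_points[i] + sparse_points[j]) / 2.0)
    return dense

def self_match_filter(points, tol=1e-6):
    # Self-matching filter stand-in: drop near-duplicate points before
    # the cloud would be consolidated in 3DGS.
    kept = []
    for p in points:
        if all(abs(p - q) > tol for q in kept):
            kept.append(p)
    return kept

sparse = [0.0, 1.0, 4.0]
matches = match_rgb_x([0.0, 1.0], [0.1, 3.0])
cloud = self_match_filter(densify(sparse, matches))
```

The confidence threshold is the knob that trades point-cloud density against the risk of propagating bad cross-modal matches; the real method's confidence model is learned from matching scores rather than this distance heuristic.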