🤖 AI Summary
This paper proposes GDROS, a geometry-guided dense registration framework for the challenging cross-modal registration of optical and SAR imagery under large geometric transformations, a task exacerbated by nonlinear radiometric discrepancies, severe geometric distortions, and heterogeneous noise. Methodologically, GDROS employs a hybrid CNN-Transformer network to extract robust cross-modal features and constructs an iteratively refined multi-scale 4D correlation volume for dense correspondence estimation. Crucially, it introduces a least-squares affine regression module that imposes explicit geometric constraints on the dense optical flow field, effectively suppressing prediction divergence under large deformations. Extensive experiments on the WHU-Opt-SAR, OS, and UBCv2 benchmarks demonstrate that GDROS consistently outperforms state-of-the-art methods across varying spatial resolutions, achieving superior quantitative accuracy (e.g., lower RMSE, higher success rate) and qualitatively precise alignment.
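The all-pairs 4D correlation volume mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration of the general RAFT-style construction (an (H, W, H, W) volume of feature similarities, average-pooled into a coarser pyramid level), not the paper's actual implementation; the function names are ours.

```python
import numpy as np

def correlation_volume(feat_a, feat_b):
    """All-pairs 4D correlation between two feature maps.

    feat_a, feat_b: (H, W, C) feature maps from the two modalities.
    Entry [i, j, k, l] of the result is the dot-product similarity
    between pixel (i, j) of feat_a and pixel (k, l) of feat_b,
    scaled by 1/sqrt(C) as is common in RAFT-style matching.
    """
    H, W, C = feat_a.shape
    a = feat_a.reshape(H * W, C)
    b = feat_b.reshape(H * W, C)
    corr = a @ b.T / np.sqrt(C)      # (H*W, H*W) similarity matrix
    return corr.reshape(H, W, H, W)

def pool_volume(corr, k=2):
    """Average-pool the last two dims to form a coarser pyramid level."""
    H, W, H2, W2 = corr.shape
    c = corr.reshape(H, W, H2 // k, k, W2 // k, k)
    return c.mean(axis=(3, 5))       # (H, W, H2//k, W2//k)
```

A multi-scale pyramid is obtained by applying `pool_volume` repeatedly; the update operator then looks up local windows of each level around the current flow estimate at every iteration.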
📝 Abstract
Registration of optical and synthetic aperture radar (SAR) remote sensing images serves as a critical foundation for image fusion and visual navigation tasks. The task is particularly challenging because of the modal discrepancy between the two sensors, primarily manifested as severe nonlinear radiometric differences (NRD), geometric distortions, and noise variations. Under large geometric transformations, existing classical template-based and sparse keypoint-based strategies struggle to achieve reliable registration results for optical-SAR image pairs. To address these limitations, we propose GDROS, a geometry-guided dense registration framework leveraging global cross-modal image interactions. First, we extract cross-modal deep features from optical and SAR images through a CNN-Transformer hybrid feature extraction module, upon which a multi-scale 4D correlation volume is constructed and iteratively refined to establish pixel-wise dense correspondences. Subsequently, we implement a least squares regression (LSR) module to geometrically constrain the predicted dense optical flow field. This geometry guidance mitigates prediction divergence by directly imposing an estimated affine transformation on the final flow predictions. Extensive experiments have been conducted on three representative datasets with different spatial resolutions (WHU-Opt-SAR, OS, and UBCv2), demonstrating the robust performance of our method across imaging resolutions. Qualitative and quantitative results show that GDROS significantly outperforms current state-of-the-art methods on all metrics. Our source code will be released at: https://github.com/Zi-Xuan-Sun/GDROS.
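The geometric constraint in the LSR step amounts to fitting an affine transform to the dense flow field by linear least squares and regenerating the flow that this transform implies. The sketch below shows that idea in NumPy under our own assumptions (pixel-grid coordinates, a plain 2x3 affine model); it is an illustration of the technique, not the paper's module, and the function names are hypothetical.

```python
import numpy as np

def affine_from_flow(flow):
    """Fit a 2x3 affine transform to a dense flow field by least squares.

    flow: (H, W, 2) array giving, at each pixel, the displacement
    (dx, dy) to its correspondence. Solves min_A ||X @ A.T - Y||^2,
    where X holds homogeneous source coordinates and Y the targets.
    """
    H, W, _ = flow.shape
    ys, xs = np.mgrid[0:H, 0:W]
    X = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)], axis=1)  # (N, 3)
    Y = X[:, :2] + flow.reshape(-1, 2)                              # (N, 2) targets
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)                       # (3, 2)
    return A.T                                                      # (2, 3)

def flow_from_affine(A, H, W):
    """Regenerate the geometry-constrained flow implied by affine A."""
    ys, xs = np.mgrid[0:H, 0:W]
    X = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)], axis=1)
    Y = X @ A.T
    return (Y - X[:, :2]).reshape(H, W, 2)
```

Replacing (or blending) the network's raw flow with `flow_from_affine(affine_from_flow(flow), H, W)` is what keeps per-pixel predictions from diverging: every residual that cannot be explained by a global affine motion is averaged out by the least-squares fit.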