🤖 AI Summary
To address low 3D reconstruction accuracy, reliance on fully sampled (full-aperture) SAR data for supervision, and high data acquisition costs in sparse multi-baseline SAR imaging, this paper proposes CMAR-Net, an optical-radar cross-modal, weakly supervised 3D reconstruction framework. Using synthetically generated optical images as the sole supervision signal, the method backpropagates geometry-consistent errors through differentiable rendering and combines this with cross-modal feature alignment and sparse signal modeling. It reconstructs structured 3D scattering distributions of vehicles without requiring ground-truth 3D SAR annotations. Key contributions include: (i) establishing the first optical-guided, weakly supervised paradigm for SAR 3D imaging; (ii) eliminating dependence on fully sampled multi-baseline SAR data; and (iii) substantially reducing data construction and preprocessing overhead. Extensive evaluations on both simulated and real-world SAR datasets demonstrate higher reconstruction accuracy than state-of-the-art compressed sensing and deep learning methods.
📝 Abstract
Multi-baseline Synthetic Aperture Radar (SAR) three-dimensional (3D) tomography is a crucial remote sensing technique that provides 3D resolution unavailable in conventional SAR imaging. However, achieving high-quality imaging typically requires multi-angle or full-aperture data, resulting in significant imaging costs. Recent advancements in sparse 3D SAR, which rely on data from limited apertures, have gained attention as a cost-effective alternative. Notably, deep learning techniques have markedly enhanced the imaging quality of sparse 3D SAR. Despite this progress, existing methods primarily depend on high-resolution radar images for supervising the training of deep neural networks (DNNs). This exclusive dependence on single-modal data prevents the introduction of complementary information from other data sources, limiting further improvements in imaging performance. In this paper, we introduce a Cross-Modal 3D-SAR Reconstruction Network (CMAR-Net) to enhance 3D SAR imaging by integrating heterogeneous information. Leveraging cross-modal supervision from 2D optical images and error transfer guaranteed by differentiable rendering, CMAR-Net achieves efficient training and reconstructs highly sparse multi-baseline SAR data into visually structured and accurate 3D images, particularly for vehicle targets. Extensive experiments on simulated and real-world datasets demonstrate that CMAR-Net significantly outperforms state-of-the-art sparse reconstruction algorithms based on compressed sensing (CS) and deep learning (DL). Furthermore, our method eliminates the need for time-consuming full-aperture data preprocessing and relies solely on computer-rendered optical images, significantly reducing dataset construction costs. This work highlights the potential of deep learning for multi-baseline SAR 3D imaging and introduces a novel framework for radar imaging research through cross-modal learning.