🤖 AI Summary
This work addresses the significant domain shift in synthetic aperture radar (SAR) imagery caused by variations in sensors and geographic regions, which severely hampers the cross-domain generalization of semantic segmentation models. To tackle this challenge, we propose CrossEarth-SAR, the first billion-scale vision foundation model for SAR, featuring a novel physics-guided sparse mixture-of-experts (MoE) architecture augmented with physical descriptors to enhance cross-domain adaptability. We also introduce CrossEarth-SAR-200K, a unified dataset comprising 200,000 samples under both weakly and fully supervised settings, along with a comprehensive cross-domain benchmark suite encompassing eight types of domain gaps and 22 subtasks. Experiments demonstrate that our method achieves state-of-the-art performance on 20 out of 22 subtasks, with mean Intersection-over-Union (mIoU) improvements exceeding 10% in several multi-domain transfer scenarios. Code, dataset, and benchmark will be fully open-sourced.
📝 Abstract
Synthetic Aperture Radar (SAR) enables global, all-weather earth observation. However, owing to diverse imaging mechanisms, domain shifts across sensors and geographic regions severely hinder the cross-domain generalization of SAR semantic segmentation models. To address this, we present CrossEarth-SAR, the first billion-scale SAR vision foundation model, built upon a novel physics-guided sparse mixture-of-experts (MoE) architecture that incorporates physical descriptors and is explicitly designed for cross-domain semantic segmentation. To facilitate large-scale pre-training, we develop CrossEarth-SAR-200K, a 200,000-sample dataset with weak and full supervision that unifies public and private SAR imagery. We also introduce a benchmark suite comprising 22 sub-benchmarks across 8 distinct domain gaps, establishing the first unified standard for domain-generalized semantic segmentation on SAR imagery. Extensive experiments demonstrate that CrossEarth-SAR achieves state-of-the-art results on 20 benchmarks, surpassing previous methods by over 10% mIoU on several benchmarks under multi-gap transfer. All code, benchmarks, and datasets will be made publicly available.
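To make the "physics-guided sparse MoE" idea concrete, the sketch below shows one plausible way a router could condition expert selection on per-image physical descriptors (e.g. incidence angle, polarization, band). This is a generic illustration under assumptions, not the paper's implementation; names such as `PhysicsGuidedMoE`, `desc_dim`, and `top_k` are hypothetical.

```python
# Minimal sketch of a sparse MoE layer with physics-conditioned routing.
# Not the authors' architecture; all names and dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PhysicsGuidedMoE(nn.Module):
    def __init__(self, dim: int, desc_dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward block applied to token features.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        # The router scores experts from token features concatenated with
        # per-image physical descriptors, making routing "physics-aware".
        self.router = nn.Linear(dim + desc_dim, num_experts)

    def forward(self, x: torch.Tensor, desc: torch.Tensor) -> torch.Tensor:
        # x: (B, N, dim) patch tokens; desc: (B, desc_dim) physical descriptors.
        B, N, _ = x.shape
        desc_tok = desc.unsqueeze(1).expand(B, N, desc.shape[-1])
        logits = self.router(torch.cat([x, desc_tok], dim=-1))          # (B, N, E)
        weights, idx = torch.topk(F.softmax(logits, dim=-1), self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e)                                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = PhysicsGuidedMoE(dim=64, desc_dim=4)
    tokens = torch.randn(2, 16, 64)        # dummy SAR patch tokens
    descriptors = torch.randn(2, 4)        # dummy physical descriptors
    print(layer(tokens, descriptors).shape)  # torch.Size([2, 16, 64])
```

The design choice illustrated here is that only the top-k experts run per token, keeping compute sparse while letting acquisition physics steer which experts specialize on which sensing conditions.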