🤖 AI Summary
Semantic annotation of SAR images is difficult because of severe speckle noise and a large dynamic range, which also complicates the pretraining of foundation models. This work applies MixMAE, a masked autoencoder variant, to Sentinel-1 SAR imagery as a pretraining approach. Its main contribution is to use a physical characteristic of the backscatter (its intensity values) to weight the reconstruction loss, giving an intensity-weighted MSE loss. The weighting is intended to reduce the influence of speckle noise and very high pixel values during reconstruction. Compared with the baseline model, the intensity-weighted loss shows promising results both in SAR pretraining and on downstream tasks, specifically flood detection.
📝 Abstract
Foundation model approaches such as masked auto-encoders (MAE) and their variants are now being applied successfully to satellite imagery. Most of the ongoing technical validation of foundation models has been carried out on optical images such as RGB or multi-spectral images. Synthetic Aperture Radar (SAR) data has received much less attention in foundation model research, because semantic labeling for dataset creation is difficult and SAR images have a higher noise content than optical images. In this work, we therefore explore a masked auto-encoder, specifically MixMAE, as a pre-training approach on Sentinel-1 SAR images and study its impact on SAR image classification tasks. We further propose using the physical characteristics of SAR data to weight the auto-encoder training loss (MSE), reducing the effect of speckle noise and very high values in the SAR images. The proposed SAR intensity-based weighting of the reconstruction loss shows promising results compared with the baseline model, both in SAR pre-training and on downstream tasks, specifically flood detection.
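To make the idea of an intensity-weighted reconstruction loss concrete, the following is a minimal NumPy sketch. The abstract does not give the weighting formula, so the function `intensity_weights` and its form `1 / (1 + intensity)` are purely illustrative assumptions, as are the function names, array shapes, and the choice to normalize by the sum of weights. It only shows how a per-pixel weight derived from SAR intensity could damp the contribution of very bright pixels to a masked MSE.

```python
import numpy as np


def intensity_weights(target):
    """Hypothetical per-pixel weights that shrink as intensity grows.

    ASSUMPTION: the form 1 / (1 + intensity) is illustrative only;
    the source does not specify the actual weighting function.
    `target` holds non-negative linear backscatter intensities.
    """
    return 1.0 / (1.0 + np.asarray(target, dtype=np.float64))


def weighted_masked_mse(pred, target, mask, weights, eps=1e-12):
    """Weighted MSE computed over masked pixels only.

    pred, target, weights: arrays of identical shape.
    mask: same shape, 1 where the pixel was masked (and must be
    reconstructed), 0 where it was visible to the encoder.
    The result is normalized by the total weight of masked pixels.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)

    sq_err = (pred - target) ** 2
    numerator = np.sum(weights * mask * sq_err)
    denominator = np.sum(weights * mask)
    return numerator / max(denominator, eps)


if __name__ == "__main__":
    # One patch with a single very bright pixel (e.g. a strong scatterer).
    target = np.array([[0.1, 0.2, 50.0, 0.1]])
    pred = np.array([[0.1, 0.2, 0.3, 0.1]])
    mask = np.ones_like(target)

    plain = weighted_masked_mse(pred, target, mask, np.ones_like(target))
    weighted = weighted_masked_mse(pred, target, mask, intensity_weights(target))
    print("plain masked MSE:   ", plain)
    print("weighted masked MSE:", weighted)
```

With uniform weights the function reduces to the ordinary masked MSE, so the weighting can be switched off for a baseline comparison without changing the rest of the training code. In this toy patch the bright pixel dominates the plain loss, while the weighted loss gives it far less influence.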