AI Summary
This study addresses boundary over-smoothing, missed detection of slender structures, and performance degradation on rare classes in nationwide land use/land cover (LULC) semantic segmentation using ALOS-2 SAR data. Building on the SAR-W-MixMAE self-supervised pretraining framework, the authors propose three lightweight enhancements: a multi-scale decoder with high-resolution feature injection, a progressive refinement upsampling head, and a focal-Dice hybrid loss with an α-scaling factor that tempers class reweighting. These modifications improve segmentation of fine-grained structures and long-tailed rare classes without increasing overall pipeline complexity. Evaluated on the Japan-wide ALOS-2 LULC benchmark, the proposed method yields consistent improvements, particularly on under-represented classes, and also improves binary water detection across standard evaluation metrics.
Abstract
This work focuses on national-scale land-use/land-cover (LULC) semantic segmentation using ALOS-2 single-polarization (HH) SAR data over Japan, together with a companion binary water detection task. Building on SAR-W-MixMAE self-supervised pretraining [1], we address three common failure modes of SAR dense prediction (boundary over-smoothing, missed thin/slender structures, and rare-class degradation under long-tailed labels) without increasing pipeline complexity. We introduce three lightweight refinements: (i) injecting high-resolution features into multi-scale decoding, (ii) a progressive refine-up head that alternates convolutional refinement and stepwise upsampling, and (iii) an $\alpha$-scale factor that tempers class reweighting within a focal+Dice objective. The resulting model yields consistent improvements on the Japan-wide ALOS-2 LULC benchmark, particularly for under-represented classes, and improves water detection across standard evaluation metrics.
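The abstract does not give the exact form of the $\alpha$-scale factor, so the following is only a minimal sketch of one plausible reading: inverse-frequency class weights raised to a power $\alpha \in [0, 1]$, so that $\alpha = 0$ gives uniform weights and $\alpha = 1$ gives full inverse-frequency reweighting, used inside a weighted focal term plus a soft Dice term. The function names `tempered_class_weights` and `focal_dice_loss`, the parameters `gamma` and `lam`, and the pixel counts are all hypothetical and are not taken from the paper.

```python
import math


def tempered_class_weights(class_counts, alpha=0.5):
    """Inverse-frequency weights raised to the power alpha, normalised to mean 1.

    Assumed form (not from the paper): w_c is proportional to (N / n_c) ** alpha.
    alpha = 0 -> uniform weights; alpha = 1 -> full inverse-frequency weights.
    Every count is assumed to be non-zero.
    """
    total = sum(class_counts)
    raw = [(total / count) ** alpha for count in class_counts]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def focal_dice_loss(probs, labels, weights, gamma=2.0, lam=1.0, eps=1e-6):
    """Class-weighted focal loss plus lam times a soft Dice loss.

    probs   : list of per-pixel class-probability lists (each sums to 1)
    labels  : list of integer class ids, one per pixel
    weights : per-class weights, e.g. from tempered_class_weights
    """
    n_classes = len(weights)

    # Focal term: down-weights pixels that are already well classified.
    focal = 0.0
    for p, y in zip(probs, labels):
        p_true = max(p[y], eps)  # clamp to avoid log(0)
        focal += -weights[y] * (1.0 - p_true) ** gamma * math.log(p_true)
    focal /= len(labels)

    # Soft Dice term: overlap between predicted mass and one-hot labels,
    # averaged over classes so that rare classes count equally.
    dice = 0.0
    for c in range(n_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        denominator = sum(p[c] for p in probs) + sum(1 for y in labels if y == c)
        dice += 1.0 - (2.0 * intersection + eps) / (denominator + eps)
    dice /= n_classes

    return focal + lam * dice


# Illustrative long-tailed pixel counts for three classes (made-up numbers).
counts = [900, 90, 10]
weights = tempered_class_weights(counts, alpha=0.5)

# Three pixels, one per class, with a moderately confident prediction for each.
probs = [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
labels = [0, 1, 2]
loss = focal_dice_loss(probs, labels, weights)
```

With this form, lowering `alpha` shrinks the ratio between the largest and smallest class weight, which is one way to keep rare classes emphasised without letting them dominate the gradient.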