🤖 AI Summary
To address three key challenges in remote sensing image cloud segmentation—fixed receptive fields, weak scene generalization, and excessive model parameters compromising real-time inference—this paper proposes a lightweight adaptive U-Net architecture. The method introduces a synergistic mechanism comprising dynamic multi-scale convolution (DMSC) and a dynamic weight-bias generator (DWBG), enabling adaptive receptive field expansion and dynamic parameterization of classification layers; depthwise separable convolutions are further integrated to reduce parameter count. The resulting model contains only 0.33 million parameters and achieves 95.3% accuracy on the SWINySEG dataset. It outperforms three state-of-the-art methods in both accuracy and inference speed, demonstrating significantly enhanced generalization across complex cloud morphologies and improved deployment efficiency.
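The dynamic multi-scale idea described above can be sketched as an input-dependent weighted fusion of branches with different receptive fields. The gating scheme, function names, and branch setup below are assumptions for illustration, not the paper's exact DMSC design:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a vector of gate scores.
    e = np.exp(x - x.max())
    return e / e.sum()

def dynamic_multiscale_fuse(branches, gate_logits):
    """Fuse feature maps from branches with different kernel sizes
    (e.g. 3x3 / 5x5 / 7x7) using softmax weights that, in a real model,
    would be predicted from the input by a small gating sub-network.

    branches: list of (H, W) feature maps, one per scale.
    gate_logits: per-branch scores (here supplied directly)."""
    w = softmax(np.asarray(gate_logits, dtype=float))
    return sum(wi * b for wi, b in zip(w, branches))

# Toy example: three constant "feature maps" standing in for conv outputs.
feats = [np.ones((4, 4)) * s for s in (1.0, 2.0, 3.0)]
fused = dynamic_multiscale_fuse(feats, gate_logits=[0.1, 0.5, 2.0])
```

Because the weights depend on the input rather than being fixed at training time, the effective receptive field can shift per image, which is the adaptive behavior the summary attributes to DMSC.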
📝 Abstract
Cloud segmentation amounts to separating cloud pixels from non-cloud pixels in an image. Current deep learning methods for cloud segmentation suffer from three issues: (a) a constrained receptive field due to the fixed size of convolution kernels; (b) a lack of robustness across different scenarios; (c) a large parameter count that limits real-time deployment. To address these issues, we propose a Dual Dynamic U-Net (DDUNet) for supervised cloud segmentation. The DDUNet adheres to a U-Net architecture and integrates two crucial modules: the dynamic multi-scale convolution (DMSC), which improves the merging of features under different receptive fields, and the dynamic weights and bias generator (DWBG), which parameterizes the classification layers to enhance generalization ability. More importantly, owing to the use of depth-wise separable convolutions, the DDUNet is a lightweight network that achieves 95.3% accuracy on the SWINySEG dataset with only 0.33M parameters, and delivers superior performance across three different configurations of the SWINySEG dataset in both accuracy and efficiency.
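The parameter savings from depth-wise separable convolution can be checked with simple arithmetic. A minimal sketch, with illustrative layer sizes not taken from the paper:

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise k x k conv (one filter per input channel)
    followed by a 1x1 point-wise conv mapping c_in -> c_out channels."""
    return c_in * k * k + c_in * c_out

# Example: a 3x3 layer with 64 input and 64 output channels.
std = standard_conv_params(64, 64, 3)        # 64 * 64 * 9  = 36864
dws = depthwise_separable_params(64, 64, 3)  # 64*9 + 64*64 = 4672
print(std, dws, round(std / dws, 1))         # roughly a 7.9x reduction
```

Applied across every convolutional block of a U-Net, this kind of factorization is what makes a sub-million-parameter model such as the reported 0.33M plausible.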