🤖 AI Summary
To address cloud-induced gaps in optical remote-sensing time series for grassland monitoring, this paper proposes an end-to-end deep learning model that jointly exploits multispectral temporal patterns and geospatial context to reconstruct cloud-contaminated regions at the pixel level with high fidelity. The method integrates spatiotemporal attention mechanisms with physically grounded loss terms, notably a radiometric-consistency term, and supports weakly supervised training that requires no ground-truth cloud masks. Built on a U-Net backbone, the architecture adds dedicated spatiotemporal attention modules and is paired with a Sentinel-2 preprocessing pipeline. On a test set from the Inner Mongolia grasslands, the model reaches a PSNR of 32.7 dB, 11.2 dB above conventional interpolation methods, and markedly improves temporal continuity and phenological-phase detection accuracy in NDVI time series. The result is a robust, generalizable approach to dynamic grassland monitoring under persistent cloud cover.
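
For reference, the two quantities cited above are standard: the 32.7 dB figure is peak signal-to-noise ratio (PSNR), and NDVI is the normalized-difference vegetation index computed from near-infrared and red reflectance. A minimal sketch of both follows; the function names and the assumption of reflectance scaled to [0, 1] are illustrative, not taken from the paper:

```python
import numpy as np

def psnr(reference: np.ndarray, reconstruction: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; `peak` is the maximum possible
    pixel value (1.0 for reflectance scaled to [0, 1])."""
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index from NIR (Sentinel-2 band B8)
    and red (band B4) reflectance."""
    return (nir - red) / (nir + red + 1e-8)  # epsilon guards against divide-by-zero

# Example: a reconstruction offset from the truth by a constant 0.1
truth = np.zeros((4, 4))
recon = np.full((4, 4), 0.1)
print(round(psnr(truth, recon), 1))  # MSE = 0.01 → 10·log10(1/0.01) = 20.0 dB
```

Higher PSNR means the reconstructed pixels are closer to the cloud-free truth, so the reported 11.2 dB gain over interpolation corresponds to a large reduction in mean-squared reconstruction error.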