🤖 AI Summary
In large-scale land cover classification trained on sparse field annotations, predictions suffer from fragmentation and noise. Method: an object-based deep learning framework that aggregates at two levels. At the input level, a graph neural network aggregates features over remote sensing superpixels and fuses transfer features from a large pretrained vision transformer (e.g., ViT) to strengthen few-shot representations; at the output level, post-processing of semantic segmentation results under a minimum mapping unit constraint enforces spatial consistency. Contribution/Results: the first systematic exploration of object-based deep learning for large-scale land cover classification under sparse supervision, jointly modeling medium-resolution imagery (e.g., Sentinel-2) and sparse in-situ labels (e.g., LUCAS). Experiments show that the method substantially improves spatial coherence while preserving classification accuracy: input-level aggregation is more robust in data-scarce regimes, whereas output-level aggregation excels with larger datasets, and several configurations outperform existing operational land cover products.
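The two aggregation levels described above can be illustrated with a minimal sketch of the input-level path: per-pixel features are mean-pooled into one node per superpixel, and a single mean-aggregation message-passing step then smooths each node with its neighbors. All function names and the flat list representation here are illustrative assumptions; the summary does not specify LC-SLab's actual GNN architecture.

```python
def pool_superpixel_features(features, segments):
    """Mean-pool per-pixel feature vectors into one vector per superpixel.

    features: one feature vector (list of floats) per pixel
    segments: one superpixel id per pixel
    """
    sums, counts = {}, {}
    for feat, seg in zip(features, segments):
        acc = sums.setdefault(seg, [0.0] * len(feat))
        for i, v in enumerate(feat):
            acc[i] += v
        counts[seg] = counts.get(seg, 0) + 1
    return {seg: [v / counts[seg] for v in acc] for seg, acc in sums.items()}


def message_pass(node_feats, edges):
    """One mean-aggregation step on the superpixel adjacency graph:
    each node is replaced by the average of itself and its neighbors."""
    neigh = {n: [n] for n in node_feats}
    for a, b in edges:
        neigh[a].append(b)
        neigh[b].append(a)
    return {
        n: [sum(node_feats[m][i] for m in ns) / len(ns)
            for i in range(len(node_feats[n]))]
        for n, ns in neigh.items()
    }


# Toy usage: four pixels forming two adjacent superpixels.
pooled = pool_superpixel_features(
    [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]],
    [0, 0, 1, 1],
)
smoothed = message_pass(pooled, [(0, 1)])
```

In a trained model the fixed mean aggregation would be replaced by learned GNN layers, and the pooled features could be concatenated with transfer features from the pretrained vision transformer before classification.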
📝 Abstract
Large-scale land cover maps generated using deep learning play a critical role across a wide range of Earth science applications. Open in-situ datasets from principled land cover surveys offer a scalable alternative to manual annotation for training such models. However, their sparse spatial coverage often leads to fragmented and noisy predictions when used with existing deep learning-based land cover mapping approaches. A promising direction to address this issue is object-based classification, which assigns labels to semantically coherent image regions rather than individual pixels, thereby imposing a minimum mapping unit. Despite this potential, object-based methods remain underexplored in deep learning-based land cover mapping pipelines, especially in the context of medium-resolution imagery and sparse supervision. To address this gap, we propose LC-SLab, the first framework for systematically exploring object-based deep learning methods for large-scale land cover classification under sparse supervision. LC-SLab supports both input-level aggregation via graph neural networks, and output-level aggregation by post-processing results from established semantic segmentation models. Additionally, we incorporate features from a large pre-trained network to improve performance on small datasets. We evaluate the framework on annual Sentinel-2 composites with sparse LUCAS labels, focusing on the tradeoff between accuracy and fragmentation, as well as sensitivity to dataset size. Our results show that object-based methods can match or exceed the accuracy of common pixel-wise models while producing substantially more coherent maps. Input-level aggregation proves more robust on smaller datasets, whereas output-level aggregation performs best with more data. Several configurations of LC-SLab also outperform existing land cover products, highlighting the framework's practical utility.
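The output-level aggregation path can likewise be sketched: predictions from a pixel-wise segmentation model are relabeled with the majority class inside each superpixel, which is what imposes the minimum mapping unit and suppresses isolated noisy pixels. This is a simplified assumption about the post-processing step; the names and the flat-list layout are illustrative, not LC-SLab's actual implementation.

```python
from collections import Counter


def object_majority_vote(pixel_labels, segments):
    """Replace each pixel's predicted class with the majority class of
    its superpixel, so no labeled region is smaller than a superpixel."""
    votes = {}
    for lab, seg in zip(pixel_labels, segments):
        votes.setdefault(seg, Counter())[lab] += 1
    winner = {seg: c.most_common(1)[0][0] for seg, c in votes.items()}
    return [winner[seg] for seg in segments]


# Toy usage: a stray class-2 pixel inside superpixel 0 is cleaned up.
cleaned = object_majority_vote([1, 1, 2, 3, 3, 3], [0, 0, 0, 1, 1, 1])
```

Because the vote runs independently per superpixel, this step scales linearly in the number of pixels and can be applied to the output of any semantic segmentation model without retraining.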