AI Summary
Existing land use and land cover (LULC) mapping models suffer from poor generalizability, heavy reliance on strong supervision, and difficulty adapting to multimodal remote sensing data and heterogeneous classification schemas. Both task-agnostic and task-specific foundation models in remote sensing face bottlenecks including high fine-tuning costs and severe label scarcity.
Method: We propose a flexible foundation model for LULC mapping: (i) constructing LAS, a large-scale weakly supervised multimodal dataset; (ii) designing remote sensing-specific adapters and a text-enhancement module to fuse cross-modal features and semantic priors; and (iii) introducing a class-confidence-guided weighted fusion strategy to boost zero-shot transferability.
Contribution/Results: Evaluated across six heterogeneous datasets (covering optical, SAR, and LiDAR modalities under diverse classification systems), the model significantly outperforms state-of-the-art methods, demonstrating superior generalization, especially to unseen modalities and novel classes, without requiring task-specific annotations.
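As a rough illustration of step (i), weakly supervised samples can be built by pairing imagery tiles with co-registered label tiles taken from an existing LULC product. The sketch below is hypothetical: the `WeakSample` structure, the `pair_weak_labels` helper, and the nodata code 255 are illustrative assumptions, not details from the paper.

```python
from dataclasses import dataclass
import numpy as np

NODATA = 255  # assumed nodata code; real LULC products vary


@dataclass
class WeakSample:
    image: np.ndarray  # (bands, H, W) sensor tile
    label: np.ndarray  # (H, W) class ids from an existing LULC product
    source: str        # name of the product supplying the weak label


def pair_weak_labels(tiles, label_tiles, product_name):
    """Pair imagery tiles with co-registered weak-label tiles,
    discarding tiles whose label raster is entirely nodata."""
    return [
        WeakSample(img, lab, product_name)
        for img, lab in zip(tiles, label_tiles)
        if not np.all(lab == NODATA)
    ]
```

The appeal of this scheme is that labels come "for free" from published products, trading some label noise for global scale, which is the trade-off weak supervision accepts.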
Abstract
Land Use and Land Cover (LULC) mapping is a fundamental task in Earth Observation (EO). However, current LULC models are typically developed for a specific modality and a fixed class taxonomy, limiting their generalizability and broader applicability. Recent advances in foundation models (FMs) offer promising opportunities for building universal models. Yet task-agnostic FMs often require fine-tuning for downstream applications, whereas task-specific FMs rely on massive amounts of labeled data for training, which is costly and impractical in the remote sensing (RS) domain. To address these challenges, we propose LandSegmenter, an LULC FM framework that resolves challenges at the input, model, and output levels. On the input side, to alleviate the heavy demand for labeled data in FM training, we introduce LAnd Segment (LAS), a large-scale, multi-modal, multi-source dataset built primarily from globally sampled weak labels drawn from existing LULC products. LAS provides a scalable, cost-effective alternative to manual annotation, enabling large-scale FM training across diverse LULC domains. For the model architecture, LandSegmenter integrates an RS-specific adapter for cross-modal feature extraction and a text encoder for enhanced semantic awareness. At the output stage, we introduce a class-wise confidence-guided fusion strategy that mitigates semantic omissions and further improves LandSegmenter's zero-shot performance. We evaluate LandSegmenter on six precisely annotated LULC datasets spanning diverse modalities and class taxonomies. Extensive transfer learning and zero-shot experiments demonstrate that LandSegmenter achieves competitive or superior performance, particularly in zero-shot settings when transferred to unseen datasets. These results highlight the efficacy of our framework and the utility of weak supervision for building task-specific FMs.
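The output-stage idea of class-wise confidence-guided fusion might look roughly like the sketch below: each class channel from each prediction source is weighted by that source's mean confidence for the class before the weighted maps are combined. This is a minimal illustration under assumed inputs (softmax probability maps), not the paper's exact formulation; the function name and weighting rule are hypothetical.

```python
import numpy as np


def class_confidence_fusion(prob_maps):
    """Fuse several (C, H, W) softmax probability maps (e.g., from
    different prompts or modalities) by weighting each class channel
    with that source's class-wise mean confidence, then normalizing
    and taking the per-pixel argmax.

    Hypothetical sketch: the actual strategy in the paper may differ.
    """
    fused = np.zeros_like(prob_maps[0], dtype=np.float64)
    weight_sum = np.zeros(prob_maps[0].shape[0], dtype=np.float64)
    for p in prob_maps:
        # class-wise confidence: mean probability of each class over all pixels
        conf = p.reshape(p.shape[0], -1).mean(axis=1)
        fused += conf[:, None, None] * p
        weight_sum += conf
    fused /= weight_sum[:, None, None]
    return fused, fused.argmax(axis=0)
```

Because each fused channel is a convex combination of the input channels, the result stays in [0, 1]; upweighting confident class channels is one simple way to keep rarely predicted but high-confidence classes from being drowned out, mitigating the semantic omissions the abstract mentions.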