SGMA: Semantic-Guided Modality-Aware Segmentation for Remote Sensing with Incomplete Multimodal Data

📅 2026-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Remote sensing multimodal semantic segmentation often suffers from missing modalities due to sensor failures or incomplete coverage, which gives rise to three major challenges: modality imbalance, large intra-class variation, and cross-modal heterogeneity. To address these issues, this work proposes the Semantic-Guided Modality-Aware framework (SGMA), comprising two plug-and-play modules: Semantic-Guided Fusion (SGF) and Modality-Aware Sampling (MAS). SGMA combines multi-scale class semantic prototypes, prototype-feature alignment for per-modality robustness assessment, adaptive weighted fusion, and dynamic sample reweighting to jointly mitigate intra-class variability and cross-modal inconsistency while balancing modality contributions. Extensive experiments demonstrate that SGMA significantly outperforms existing methods across multiple remote sensing datasets and backbone networks, with particularly notable gains under fragile or degraded modalities.

📝 Abstract
Multimodal semantic segmentation integrates complementary information from diverse sensors for remote sensing Earth observation. However, practical systems often encounter missing modalities due to sensor failures or incomplete coverage, termed Incomplete Multimodal Semantic Segmentation (IMSS). IMSS faces three key challenges: (1) multimodal imbalance, where dominant modalities suppress fragile ones; (2) intra-class variation in scale, shape, and orientation across modalities; and (3) cross-modal heterogeneity, where conflicting cues produce inconsistent semantic responses. Existing methods rely on contrastive learning or joint optimization: the former risks over-alignment that discards modality-specific cues, the latter risks imbalanced training that favors robust modalities, and both largely overlook intra-class variation and cross-modal heterogeneity. To address these limitations, we propose the Semantic-Guided Modality-Aware (SGMA) framework, which ensures balanced multimodal learning while reducing intra-class variation and reconciling cross-modal inconsistencies through semantic guidance. SGMA introduces two complementary plug-and-play modules: (1) the Semantic-Guided Fusion (SGF) module extracts multi-scale, class-wise semantic prototypes that capture consistent categorical representations across modalities, estimates per-modality robustness based on prototype-feature alignment, and performs adaptive fusion weighted by robustness scores to mitigate intra-class variation and cross-modal heterogeneity; (2) the Modality-Aware Sampling (MAS) module leverages the robustness estimates from SGF to dynamically reweight training samples, prioritizing challenging samples from fragile modalities to address modality imbalance. Extensive experiments across multiple datasets and backbones demonstrate that SGMA consistently outperforms state-of-the-art methods, with particularly significant improvements on fragile modalities.
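The fusion and reweighting steps described in the abstract can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration of the ideas, not the authors' implementation: the cosine-alignment robustness score, the softmax fusion weights, the `1 - score` sample weighting, and all function names and shapes are assumptions made for clarity.

```python
# Hypothetical sketch of SGF-style robustness-weighted fusion and
# MAS-style sample reweighting. Illustrative only; the paper's actual
# prototype extraction is multi-scale and learned end-to-end.
import numpy as np

def class_prototypes(feats, labels, num_classes):
    """Per-class mean feature vectors (semantic prototypes).

    feats:  (N, D) pixel/patch features; labels: (N,) class ids.
    Returns (num_classes, D); absent classes get zero vectors.
    """
    protos = np.zeros((num_classes, feats.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = feats[mask].mean(axis=0)
    return protos

def robustness_score(feats, labels, protos):
    """Mean cosine alignment between features and their class prototypes.

    Higher alignment with the shared semantic prototypes is read as
    higher modality reliability (an assumed proxy for SGF's estimate).
    """
    p = protos[labels]                                        # (N, D)
    num = (feats * p).sum(axis=1)
    den = np.linalg.norm(feats, axis=1) * np.linalg.norm(p, axis=1) + 1e-8
    return float((num / den).mean())

def adaptive_fusion(modality_feats, scores, temperature=1.0):
    """Fuse modalities with softmax weights over robustness scores."""
    s = np.asarray(scores) / temperature
    w = np.exp(s - s.max())
    w /= w.sum()
    return sum(wi * f for wi, f in zip(w, modality_feats)), w

def sample_weights(per_sample_alignments):
    """MAS-style reweighting: samples with low alignment (harder,
    fragile-modality samples) receive larger training weights."""
    s = np.asarray(per_sample_alignments)
    w = 1.0 - s
    return w / (w.sum() + 1e-8)
```

With a clean modality and a noise-corrupted one, the clean modality receives the larger fusion weight, while `sample_weights` pushes training emphasis toward the poorly aligned samples.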
Problem

Research questions and friction points this paper is trying to address.

Incomplete Multimodal Semantic Segmentation
Multimodal Imbalance
Intra-class Variation
Cross-modal Heterogeneity
Remote Sensing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-Guided Fusion
Modality-Aware Sampling
Incomplete Multimodal Semantic Segmentation
Cross-Modal Heterogeneity
Intra-Class Variation
Lekang Wen
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University
Liang Liao
Xidian University
Image/Video Processing, Video Quality Assessment
Jing Xiao
Beijing Key Laboratory of Learning and Cognition, School of Psychology, Capital Normal University
cognitive vulnerability to depression, school psychology, cognition and learning
Mi Wang
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University