🤖 AI Summary
Existing multi-path hierarchical multi-label classification methods for remote sensing imagery struggle to model hierarchical semantics effectively, limiting performance gains. This work proposes a novel framework that integrates graph-aware textual descriptions for hierarchical semantic initialization with graph convolutional network (GCN)-based structural encoding. A dynamic multimodal fusion mechanism adaptively balances semantic priors against visual evidence. The approach further introduces a hierarchy-aware loss function and a multi-path adaptive propagation mechanism, enabling, for the first time, efficient modeling of hierarchical semantics when multiple taxonomic branches are activated. Evaluated on the AID, DFC-15, and MLRSNet benchmark datasets, the method achieves up to a 42% performance improvement in few-shot settings with only a 2.6% increase in model parameters.
📝 Abstract
Hierarchical multi-label classification (HMLC) is essential for modeling structured label dependencies in remote sensing. Yet existing approaches struggle in multi-path settings, where images may activate multiple taxonomic branches, leading to underuse of hierarchical information. We propose MAPLE (Multi-Path Adaptive Propagation with Level-Aware Embeddings), a framework that integrates (i) hierarchical semantic initialization from graph-aware textual descriptions, (ii) graph-based structure encoding via graph convolutional networks (GCNs), and (iii) adaptive multimodal fusion that dynamically balances semantic priors and visual evidence. An adaptive level-aware objective automatically selects appropriate losses per hierarchy level. Evaluations on CORINE-aligned remote sensing datasets (AID, DFC-15, and MLRSNet) show consistent improvements of up to +42% in few-shot regimes while adding only 2.6% parameter overhead, demonstrating that MAPLE effectively and efficiently models hierarchical semantics for Earth observation (EO).
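To make the two core mechanisms named above concrete, here is a minimal, untrained sketch of (a) one GCN-style propagation step of label embeddings over the class hierarchy and (b) a gated fusion of semantic priors with visual features. All function names, the mean-aggregation rule, and the gate construction are illustrative assumptions, not MAPLE's actual architecture or API (which uses learned weights).

```python
import math

def gcn_propagate(embeddings, edges):
    """One mean-aggregation GCN step over an undirected label hierarchy.

    embeddings: list of per-class embedding vectors (lists of floats).
    edges: (parent, child) index pairs defining the hierarchy.
    Simplified assumption: no learned weight matrix, just neighbor
    averaging with self-loops, which is the structural core of a GCN layer.
    """
    n = len(embeddings)
    neighbors = {i: {i} for i in range(n)}  # self-loops
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    out = []
    for i in range(n):
        nbrs = neighbors[i]
        dim = len(embeddings[i])
        out.append([sum(embeddings[j][d] for j in nbrs) / len(nbrs)
                    for d in range(dim)])
    return out

def gated_fusion(semantic, visual):
    """Blend a semantic prior with visual evidence via a per-dim sigmoid gate.

    In a real model the gate would be produced by a learned network; here it
    is a placeholder function of the two inputs, purely for illustration.
    """
    fused = []
    for s, v in zip(semantic, visual):
        g = 1.0 / (1.0 + math.exp(-(s - v)))  # illustrative stand-in gate
        fused.append(g * s + (1.0 - g) * v)
    return fused

# Tiny example: a 3-class hierarchy (class 0 is the parent of 1 and 2).
label_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
smoothed = gcn_propagate(label_emb, [(0, 1), (0, 2)])
scores = gated_fusion(smoothed[1], [0.2, 0.8])  # fuse with a visual feature
```

After propagation, each class embedding mixes in its parent's and children's embeddings, which is how structural priors flow between hierarchy levels before fusion with the image features.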