🤖 AI Summary
Slum detection in satellite imagery generalizes poorly across regions, mainly because informal settlements vary widely in morphology and labeled data is unavailable in target domains. To address this, the authors propose GRAM, a framework that pairs a large-scale, multi-continent remote-sensing dataset of slums with a two-stage test-time adaptation mechanism built on a region-aware Mixture-of-Experts (MoE) architecture. GRAM combines consistency-regularized pseudo-label filtering with shared–specific feature disentanglement to adapt to new regions without target labels. Evaluated on low-resource urban areas, particularly in Africa, GRAM outperforms existing methods. It remains robust in cross-domain detection using only a small number of source-domain annotations, supporting efficient, scalable global slum mapping and poverty monitoring.
📝 Abstract
Satellite-based slum segmentation holds significant promise for generating global estimates of urban poverty. However, the morphological heterogeneity of informal settlements is a major challenge: models trained on specific regions often fail to generalize to unseen locations. To address this, we introduce a large-scale high-resolution dataset and propose GRAM (Generalized Region-Aware Mixture-of-Experts), a two-phase test-time adaptation framework that enables robust slum segmentation without requiring labeled data from target regions. We compile a million-scale satellite imagery dataset from 12 cities across four continents for source training. Trained on this dataset, the model uses a Mixture-of-Experts architecture to capture region-specific slum characteristics while learning universal features through a shared backbone. During adaptation, prediction consistency across experts filters out unreliable pseudo-labels, allowing the model to generalize effectively to previously unseen regions. GRAM outperforms state-of-the-art baselines in low-resource settings such as African cities, offering a scalable and label-efficient solution for global slum mapping and data-driven urban planning.
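The abstract does not spell out how expert consistency is turned into a filtering rule, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each expert emits a per-pixel slum probability, and it keeps a pseudo-label only when enough experts agree on the binary class and the averaged prediction is confident. The function name `filter_pseudo_labels` and the parameters `agree_frac` and `min_conf` are hypothetical and chosen for illustration.

```python
from typing import List, Optional


def filter_pseudo_labels(
    expert_probs: List[List[float]],
    agree_frac: float = 1.0,   # hypothetical: fraction of experts that must agree
    min_conf: float = 0.8,     # hypothetical: minimum confidence of the mean prediction
) -> List[Optional[int]]:
    """Return one pseudo-label per pixel, or None where the label is rejected.

    expert_probs[e][p] is expert e's predicted slum probability for pixel p.
    """
    n_experts = len(expert_probs)
    n_pixels = len(expert_probs[0])
    labels: List[Optional[int]] = []
    for p in range(n_pixels):
        probs = [expert_probs[e][p] for e in range(n_experts)]
        mean_prob = sum(probs) / n_experts
        # Candidate pseudo-label comes from the averaged expert prediction.
        label = 1 if mean_prob >= 0.5 else 0
        # Count the experts whose own hard decision matches the candidate label.
        votes = sum(1 for q in probs if (q >= 0.5) == (label == 1))
        # Confidence is the mean probability assigned to the chosen class.
        confidence = mean_prob if label == 1 else 1.0 - mean_prob
        if votes / n_experts >= agree_frac and confidence >= min_conf:
            labels.append(label)
        else:
            labels.append(None)  # unreliable pixel: excluded from adaptation
    return labels


# Toy example: 3 experts, 4 pixels from an unlabeled target region.
expert_probs = [
    [0.95, 0.10, 0.60, 0.55],
    [0.90, 0.05, 0.30, 0.60],
    [0.92, 0.15, 0.70, 0.52],
]
pseudo = filter_pseudo_labels(expert_probs)
print(pseudo)  # [1, 0, None, None]
```

In this toy run the third pixel is dropped because one expert disagrees with the other two, and the fourth is dropped because the experts agree but only weakly. Under this reading, only the retained pixels would contribute to the adaptation loss on the target region; a real system would apply the same rule to full probability maps rather than Python lists.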