Geospatial foundation-model embeddings improve population estimation unevenly across space and scale

📅 2026-05-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the persistent challenge of subnational population estimation in regions with sparse, outdated, or low-resolution census data. It systematically evaluates embeddings from a geospatial foundation model, the Population Dynamics Foundation Model (PDFM), as alternatives to conventional covariates for multiscale population modeling across Brazil, Nigeria, and the United States. Under geographically structured validation, PDFM embeddings reduce unexplained variance by a median of 20.1% and Kullback–Leibler (KL) divergence by 23.2%, with the largest gains in larger and less-developed subnational areas. However, the embeddings transfer less flexibly across spatial aggregations than conventional handcrafted covariates. These findings clarify both the promise and the predictable limitations of foundation models in demographic estimation tasks.

📝 Abstract

Reliable subnational population estimates are essential for applications, yet remain difficult where censuses are sparse, outdated or spatially coarse. Existing population-mapping workflows rely on hand-built geospatial covariates, such as settlement extent, night-time lights, and environmental conditions, which must be assembled and harmonised across scales and geographies. Geospatial foundation models offer an alternative by learning reusable representations of place from more multifaceted and heterogeneous data sources. Here, we benchmark Population Dynamics Foundation Model (PDFM) embeddings against harmonised geospatial covariates for subnational population estimation in Brazil, Nigeria and the United States. Under geographically structured validation, PDFM improved predictive fit, with a median 20.1% reduction in unexplained variance (IQR: 10.0-33.2%, across country-model comparisons), and reduced Kullback-Leibler divergence by 23.2% (9.2-26.2%). However, these gains were uneven. PDFM was most advantageous where the geospatial covariates weakly characterised settlement context, such as in larger and less-developed subnational areas. Moreover, PDFM performance was scale-coupled, with embeddings providing less flexible transfer across spatial aggregations than geospatial covariates. These findings show that geospatial foundation-model representations of place can improve population estimation in data-poor settings, but their benefits break down predictably under spatial scale mismatch, revealing a fundamental limitation of current geospatial AI.
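To make the two headline metrics concrete, the sketch below shows one plausible way to compute a percentage reduction in unexplained variance and in Kullback-Leibler divergence when comparing a covariate-based model with an embedding-based one. It is an illustration under stated assumptions, not the paper's implementation: unexplained variance is taken to be 1 - R², KL divergence is computed between observed and predicted population shares across areas, and all function names and numbers are hypothetical.

```python
import numpy as np


def unexplained_variance(y_true, y_pred):
    """Fraction of variance left unexplained (assumed here to mean 1 - R^2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(ss_res / ss_tot)


def kl_divergence(pop_true, pop_pred, eps=1e-12):
    """KL(P || Q), where P and Q are observed and predicted population shares."""
    p = np.asarray(pop_true, dtype=float)
    q = np.asarray(pop_pred, dtype=float)
    p = p / p.sum()
    q = np.clip(q, eps, None)  # guard against zero predicted shares
    q = q / q.sum()
    mask = p > 0  # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def percent_reduction(baseline, candidate):
    """Percentage by which `candidate` lowers an error metric relative to `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Hypothetical held-out populations and two sets of predictions (made-up numbers).
observed = np.array([1200.0, 800.0, 3000.0, 500.0, 1500.0])
pred_covariates = np.array([1000.0, 950.0, 2600.0, 700.0, 1650.0])
pred_embeddings = np.array([1150.0, 850.0, 2800.0, 600.0, 1550.0])

uv_gain = percent_reduction(
    unexplained_variance(observed, pred_covariates),
    unexplained_variance(observed, pred_embeddings),
)
kl_gain = percent_reduction(
    kl_divergence(observed, pred_covariates),
    kl_divergence(observed, pred_embeddings),
)
print(f"Reduction in unexplained variance: {uv_gain:.1f}%")
print(f"Reduction in KL divergence: {kl_gain:.1f}%")
```

In a benchmark like the one described, such reductions would be computed on held-out areas for each country-model comparison and then summarised by their median and interquartile range.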

Problem

Research questions and friction points this paper is trying to address.

population estimation

geospatial foundation models

subnational

data-poor settings

spatial scale

Innovation

Methods, ideas, or system contributions that make the work stand out.

geospatial foundation models

population estimation

spatial scale