🤖 AI Summary
This study addresses the challenge of high-accuracy, high-efficiency global mapping when geotagged labels are sparse. We propose the first general-purpose embedding field model that consistently outperforms existing feature-based methods across diverse mapping tasks—without task-specific fine-tuning. The model integrates multi-source Earth observation data (optical, SAR, meteorological), multi-temporal sequences, and spatial context, leveraging self-supervised learning to construct a unified spatiotemporal embedding representation. This enables cross-task and cross-resolution generalization, from local to global scales. Evaluated on standard benchmarks including land cover classification and change detection, our method achieves state-of-the-art performance. As a key contribution, we will publicly release a global, annual, analysis-ready embedding dataset spanning 2017–2024, providing a foundational representation resource for remote sensing mapping and dynamic Earth monitoring.
📝 Abstract
Unprecedented volumes of Earth observation data are continually collected around the world, but high-quality labels remain scarce given the effort required to make physical measurements and observations. This has led to considerable investment in bespoke modeling efforts that translate sparse labels into maps. Here we introduce AlphaEarth Foundations, an embedding field model yielding a highly general geospatial representation that assimilates spatial, temporal, and measurement contexts across multiple sources, enabling accurate and efficient production of maps and monitoring systems from local to global scales. The embeddings generated by AlphaEarth Foundations are the only ones to consistently outperform all previous featurization approaches tested on a diverse set of mapping evaluations, without re-training. We will release a dataset of global, annual, analysis-ready embedding field layers from 2017 through 2024.