🤖 AI Summary
GeoML faces a critical challenge: remote sensing data are heterogeneous (multi-scale, multi-temporal, and multi-modal, with inconsistent georeferencing and file formats), while domain-specific software tooling has lagged behind. To address this, the chapter systematically surveys the GeoML software ecosystem, covering the evolution, core functionality, and architecture of libraries such as TorchGeo, eo-learn, and Raster Vision. Its main contributions are: (i) an overview of common methodologies, including data preprocessing, spatiotemporal joins, benchmarking, and the use of pre-trained models; and (ii) a crop-type mapping case study that demonstrates how these tools are applied in practice. It also highlights best practices in software design, licensing, and testing, and discusses open challenges such as the rise of foundation models and the need for governance in open-source geospatial software.
📝 Abstract
Recent advances in machine learning have been supported by the emergence of domain-specific software libraries, enabling streamlined workflows and increased reproducibility. For geospatial machine learning (GeoML), the availability of Earth observation data has outpaced the development of domain libraries to handle its unique challenges, such as varying spatial resolutions, spectral properties, temporal cadence, data coverage, coordinate systems, and file formats. This chapter presents a comprehensive overview of GeoML libraries, analyzing their evolution, core functionalities, and the current ecosystem. It also introduces popular GeoML libraries such as TorchGeo, eo-learn, and Raster Vision, detailing their architecture, supported data types, and integration with ML frameworks. Additionally, it discusses common methodologies for data preprocessing, spatial-temporal joins, benchmarking, and the use of pretrained models. Through a case study in crop type mapping, it demonstrates practical applications of these tools. Best practices in software design, licensing, and testing are highlighted, along with open challenges and future directions, particularly the rise of foundation models and the need for governance in open-source geospatial software. Our aim is to guide practitioners, developers, and researchers in navigating and contributing to the rapidly evolving GeoML landscape.