🤖 AI Summary
Ionospheric forecasting is hampered by sparse observational coverage, complex multi-sphere coupling, and stringent real-time requirements. To address these challenges, this work introduces the first open, machine learning–oriented ionospheric prediction dataset. It spatiotemporally aligns and modularly integrates heterogeneous data sources, including Solar Dynamics Observatory (SDO) imagery, solar wind parameters, geomagnetic indices (Kp, AE, SYM-H), F10.7 solar radio flux, JPL GIM-TEC maps, and TEC measurements derived from GNSS receivers and smartphones. A unified standardization pipeline is proposed to support both physics-informed and data-driven modeling. Leveraging this dataset, we systematically train and evaluate diverse spatiotemporal machine learning models. Our approach achieves high-accuracy short-term vertical TEC (vTEC) forecasting under both quiet and geomagnetically active conditions, improving dynamic ionospheric modeling and the operational utility of space weather forecasts.
📝 Abstract
Operational forecasting of the ionosphere remains a critical space weather challenge due to sparse observations, complex coupling across geospatial layers, and a growing need for timely, accurate predictions that support Global Navigation Satellite System (GNSS) services, communications, aviation safety, and satellite operations. As part of the 2025 NASA Heliolab, we present a curated, open-access dataset that integrates diverse ionospheric and heliospheric measurements into a coherent, machine learning-ready structure, designed specifically to support next-generation forecasting models and address gaps in current operational frameworks. Our workflow integrates a broad selection of data sources: Solar Dynamics Observatory (SDO) data, solar irradiance indices (F10.7), solar wind parameters (velocity and interplanetary magnetic field), geomagnetic activity indices (Kp, AE, SYM-H), and NASA JPL's Global Ionospheric Maps of Total Electron Content (GIM-TEC). We also incorporate geospatially sparse data, such as TEC derived from the World-Wide GNSS Receiver Network and from crowdsourced Android smartphone measurements. This novel heterogeneous dataset is temporally and spatially aligned into a single, modular data structure that supports both physical and data-driven modeling. Leveraging this dataset, we train and benchmark several spatiotemporal machine learning architectures for forecasting vertical TEC under both quiet and geomagnetically active conditions. Together, the dataset and modeling pipeline enable exploration of not only ionospheric dynamics but also broader Sun-Earth interactions, supporting both scientific inquiry and operational forecasting efforts.
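To make the temporal-alignment step concrete, the following is a minimal sketch of how indices with different native cadences might be brought onto one common time grid. It is not the authors' pipeline: the 15-minute target cadence, the column names, the `align_to_grid` helper, and all numeric values are assumptions chosen for illustration. The only facts relied on are the native cadences of the indices (Kp is 3-hourly, F10.7 is daily, SYM-H is 1-minute).

```python
import numpy as np
import pandas as pd

# Hypothetical common grid: 4 hours at a 15-minute step (cadence is an assumption).
target = pd.date_range("2024-05-10 00:00", periods=16, freq="15min")

# Synthetic stand-ins for real sources; the values are made up.
kp = pd.Series(
    [2.0, 7.0],
    index=pd.to_datetime(["2024-05-10 00:00", "2024-05-10 03:00"]),
)  # 3-hourly index
f107 = pd.Series([150.0], index=pd.to_datetime(["2024-05-10"]))  # daily index
symh = pd.Series(
    -np.arange(240, dtype=float),
    index=pd.date_range("2024-05-10 00:00", periods=240, freq="1min"),
)  # 1-minute index


def align_to_grid(target, kp, f107, symh):
    """Place slow and fast series on one grid using only past information."""
    out = pd.DataFrame(index=target)
    # Slow indices: carry the most recent published value forward.
    out["kp"] = kp.reindex(target, method="ffill")
    out["f107"] = f107.reindex(target, method="ffill")
    # Fast index: average over the window (t - 15 min, t] so the value
    # stored at time t never includes samples later than t.
    symh_15 = symh.resample("15min", closed="right", label="right").mean()
    out["symh"] = symh_15.reindex(target)
    return out


aligned = align_to_grid(target, kp, f107, symh)
print(aligned.head())
```

Forward-filling the slow indices and using right-closed averaging windows for the fast one keeps every row free of future samples, which matters when the aligned table feeds a forecasting model. Spatial alignment of gridded maps and sparse receiver measurements would need a separate step that this sketch does not cover.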