Connecting the Dots: A Machine Learning Ready Dataset for Ionospheric Forecasting Models

📅 2025-11-18
🤖 AI Summary
Ionospheric forecasting is hampered by sparse observational coverage, complex coupling across geospatial layers, and the need for timely, accurate predictions. To address these challenges, this work introduces an open, machine learning-oriented dataset for ionospheric prediction. It temporally and spatially aligns heterogeneous data sources into a single modular structure: Solar Dynamics Observatory (SDO) data, solar wind parameters, geomagnetic indices (Kp, AE, SYM-H), F10.7 solar radio flux, JPL GIM-TEC maps, and TEC derived from GNSS receivers and crowdsourced smartphone measurements. The aligned structure supports both physical and data-driven modeling. Using this dataset, the authors train and benchmark several spatiotemporal machine learning architectures for forecasting vertical TEC (vTEC) under both quiet and geomagnetically active conditions.

📝 Abstract
Operational forecasting of the ionosphere remains a critical space weather challenge due to sparse observations, complex coupling across geospatial layers, and a growing need for timely, accurate predictions that support Global Navigation Satellite System (GNSS), communications, aviation safety, and satellite operations. As part of the 2025 NASA Heliolab, we present a curated, open-access dataset that integrates diverse ionospheric and heliospheric measurements into a coherent, machine learning-ready structure, designed specifically to support next-generation forecasting models and address gaps in current operational frameworks. Our workflow integrates a large selection of data sources comprising Solar Dynamics Observatory data, solar irradiance indices (F10.7), solar wind parameters (velocity and interplanetary magnetic field), geomagnetic activity indices (Kp, AE, SYM-H), and NASA JPL's Global Ionospheric Maps of Total Electron Content (GIM-TEC). We also incorporate geospatially sparse data such as the TEC derived from the World-Wide GNSS Receiver Network and crowdsourced Android smartphone measurements. This novel heterogeneous dataset is temporally and spatially aligned into a single, modular data structure that supports both physical and data-driven modeling. Leveraging this dataset, we train and benchmark several spatiotemporal machine learning architectures for forecasting vertical TEC under both quiet and geomagnetically active conditions. This work presents an extensive dataset and modeling pipeline that enables exploration of not only ionospheric dynamics but also broader Sun-Earth interactions, supporting both scientific inquiry and operational forecasting efforts.
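The temporal alignment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `align_to_grid`, the forward-fill policy, the grid cadence, and all sample values below are assumptions chosen for illustration. The sketch places time series with different cadences (for example a 3-hourly index and a daily index) onto one shared time grid, so that every row of the resulting table describes a single timestamp.

```python
import pandas as pd


def align_to_grid(series_by_name, start, end, freq="15min"):
    """Place several time series onto one shared, regular time grid.

    Hypothetical helper: each grid timestamp receives the most recent
    observation at or before it (forward fill). Grid points that precede
    the first observation of a series stay NaN.
    """
    grid = pd.date_range(start=start, end=end, freq=freq)
    aligned = {}
    for name, series in series_by_name.items():
        # reindex with method="ffill" needs a sorted (monotonic) index
        aligned[name] = series.sort_index().reindex(grid, method="ffill")
    return pd.DataFrame(aligned, index=grid)


# Synthetic values, for illustration only (not real measurements)
kp = pd.Series(
    [2.0, 5.0],
    index=pd.to_datetime(["2024-05-10 00:00", "2024-05-10 03:00"]),
)
f107 = pd.Series([150.0], index=pd.to_datetime(["2024-05-10 00:00"]))

table = align_to_grid(
    {"kp": kp, "f107": f107},
    start="2024-05-10 00:00",
    end="2024-05-10 03:00",
    freq="60min",
)
```

Forward fill is only one possible policy. Interpolation or windowed averaging may suit some quantities better, and spatial alignment of sparse receiver data onto a map grid is a separate step that this sketch does not cover.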
Problem

Research questions and friction points this paper is trying to address.

Creating a machine learning-ready dataset for ionospheric forecasting models
Integrating diverse ionospheric and heliospheric measurements into a coherent structure
Training spatiotemporal ML models to forecast vertical TEC under both quiet and geomagnetically active conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated diverse ionospheric and heliospheric data into a machine learning-ready structure
Aligned heterogeneous datasets temporally and spatially into a single modular data structure
Trained and benchmarked spatiotemporal ML models for vertical TEC forecasting
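To make the forecasting setup concrete, here is a minimal sketch of how a time-ordered stack of gridded TEC maps could be cut into supervised training pairs. It is an illustration under stated assumptions, not the paper's method: the function name `make_windows`, the array layout `(time, lat, lon)`, and the `history` and `horizon` parameters are all hypothetical.

```python
import numpy as np


def make_windows(maps, history, horizon):
    """Build (input, target) pairs from a time-ordered stack of maps.

    maps:    array of shape (T, H, W), one map per time step (assumed layout)
    history: number of consecutive past maps used as model input
    horizon: number of steps between the last input map and the target map
    Returns X of shape (N, history, H, W) and y of shape (N, H, W).
    """
    maps = np.asarray(maps)
    n_samples = maps.shape[0] - history - horizon + 1
    if n_samples <= 0:
        raise ValueError("not enough time steps for this history/horizon")
    # Input window i covers steps i .. i+history-1
    X = np.stack([maps[i:i + history] for i in range(n_samples)])
    # Target is the map `horizon` steps after the last input step
    y = np.stack([maps[i + history + horizon - 1] for i in range(n_samples)])
    return X, y


# Synthetic stack: 10 time steps on a 1x1 grid, value equals the step index
demo = np.arange(10, dtype=float).reshape(10, 1, 1)
X, y = make_windows(demo, history=3, horizon=2)
```

When evaluating quiet and geomagnetically active periods separately, the split into training and test sets would normally be made along the time axis before windowing, so that no window straddles both sets.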
Authors
Linnea M. Wolniewicz
Department of Information and Computer Science, University of Hawai‘i at Mānoa, USA
Halil S. Kelebek
Department of Engineering Science, University of Oxford, UK
Simone Mestici
Università degli Studi di Roma Sapienza, Rome, Italy
Michael D. Vergalla
Free Flight Research Lab, Sunnyvale, USA
Giacomo Acciarini
Research Fellow at European Space Agency, Advanced Concepts Team
Astrodynamics, Artificial Intelligence, Machine Learning, Uncertainty Propagation, Optimization
Bala Poduval
University of New Hampshire
Olga Verkhoglyadova
NASA Jet Propulsion Laboratory
Madhulika Guhathakurta
NASA Headquarters
Thomas E. Berger
Space Weather Technology, Research, and Education Center, University of Colorado Boulder
Atılım Güneş Baydin
University of Oxford
Machine Learning, Probabilistic Programming, Simulation-based Inference, Physical Sciences
Frank Soboczenski
Assistant Professor, University of York & Affiliate King's College London
Machine Learning, Human-Computer Interaction, Natural Language Processing, Data Science, Space Science