SuryaBench: Benchmark Dataset for Advancing Machine Learning in Heliophysics and Space Weather Prediction

📅 2025-08-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Solar physics and space weather forecasting lack high-resolution, machine learning-ready datasets. Method: We construct a standardized heliophysics dataset spanning a full solar cycle (May 2010 to July 2024), derived from SDO/AIA and HMI observations. We apply a unified preprocessing pipeline (spacecraft roll correction, orbital adjustment, exposure normalization, and instrument degradation compensation) to ensure spatiotemporal consistency and physical interpretability. The dataset integrates multi-wavelength EUV imagery and vector magnetograms, and provides benchmark subsets for tasks such as active region segmentation, solar flare prediction, and coronal magnetic field extrapolation. Contribution/Results: This work establishes a reproducible, comparable, task-driven data benchmark explicitly designed for AI research in solar physics. It aims to improve model development efficiency and evaluation consistency for space weather forecasting.

📝 Abstract
This paper introduces a high-resolution, machine learning-ready heliophysics dataset derived from NASA's Solar Dynamics Observatory (SDO), specifically designed to advance machine learning (ML) applications in solar physics and space weather forecasting. The dataset includes processed imagery from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI), spanning a solar cycle from May 2010 to July 2024. To ensure suitability for ML tasks, the data have been preprocessed with spacecraft roll-angle correction, orbital adjustments, exposure normalization, and degradation compensation. We also provide auxiliary benchmark datasets that complement the core SDO dataset. These cover central heliophysics and space weather tasks such as active region segmentation, active region emergence forecasting, coronal field extrapolation, solar flare prediction, solar EUV spectra prediction, and solar wind speed estimation. By establishing a unified, standardized data collection, this dataset aims to facilitate benchmarking, enhance reproducibility, and accelerate the development of AI-driven models for critical space weather prediction tasks, bridging gaps between solar physics, machine learning, and operational forecasting.
Problem

Research questions and friction points this paper is trying to address.

Providing processed solar data for machine learning applications
Enabling benchmarking for space weather prediction tasks
Bridging gaps between solar physics and operational forecasting
Innovation

Methods, ideas, or system contributions that make the work stand out.

High-resolution heliophysics dataset from SDO
Preprocessed with roll correction and normalization
Benchmark applications for space weather forecasting
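Two of the preprocessing steps named in the abstract, exposure normalization and degradation compensation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `sensitivity` parameter (the fraction of original instrument response assumed to remain), and the toy values are all hypothetical, and roll correction and orbital adjustment are not covered.

```python
import numpy as np


def normalize_exposure(image, exposure_s):
    """Convert raw detector counts to counts per second (hypothetical helper)."""
    if exposure_s <= 0:
        raise ValueError("exposure_s must be positive")
    return np.asarray(image, dtype=np.float64) / exposure_s


def compensate_degradation(image, sensitivity):
    """Rescale by the assumed remaining sensitivity, a fraction in (0, 1]."""
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    return np.asarray(image, dtype=np.float64) / sensitivity


# Toy 2x2 "image"; every value here is an illustrative assumption.
raw = np.array([[100.0, 200.0], [300.0, 400.0]])
corrected = compensate_degradation(
    normalize_exposure(raw, exposure_s=2.0),
    sensitivity=0.8,
)
print(corrected)
```

Dividing by exposure time first puts frames with different exposures on a common counts-per-second scale; dividing by the sensitivity fraction then offsets the assumed loss of instrument response over time.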
Sujit Roy
Earth System Science Center, University of Alabama in Huntsville, AL, USA; NASA Marshall Space Flight Center, Huntsville, AL, USA
Dinesha V. Hegde
Department of Space Science, The University of Alabama in Huntsville, AL, USA; Center for Space Plasma and Aeronomic Research (CSPAR), The University of Alabama in Huntsville, AL, USA
Johannes Schmude
IBM Research
Amy Lin
Earth System Science Center, University of Alabama in Huntsville, AL, USA
Vishal Gaur
Earth System Science Center, University of Alabama in Huntsville, AL, USA
Rohit Lal
Earth System Science Center, University of Alabama in Huntsville, AL, USA
Kshitiz Mandal
Earth System Science Center, University of Alabama in Huntsville, AL, USA
Talwinder Singh
Georgia State University
Andrés Muñoz-Jaramillo
Southwest Research Institute
Kang Yang
Georgia State University
Chetraj Pandey
Assistant Professor, Department of Computer Science, Texas Christian University
Explainable Deep Learning, Continual Learning, Space Weather
Jinsu Hong
Georgia State University
Berkay Aydin
Georgia State University
Ryan McGranaghan
NASA Jet Propulsion Laboratory
Spiridon Kasapis
Princeton University
Vishal Upendran
Research Scientist, SETI Institute / LMSAL
Astronomy and Astrophysics, Deep Learning
Shah Bahauddin
Laboratory for Atmospheric and Space Physics, University of Colorado Boulder
Daniel da Silva
NASA Goddard Space Flight Center
Marcus Freitag
IBM Research
Iksha Gurung
Earth System Science Center, University of Alabama in Huntsville, AL, USA
Nikolai Pogorelov
Department of Space Science, The University of Alabama in Huntsville, AL, USA; Center for Space Plasma and Aeronomic Research (CSPAR), The University of Alabama in Huntsville, AL, USA
Campbell Watson
IBM Research
Manil Maskey
NASA Marshall Space Flight Center, Huntsville, AL, USA
Juan Bernabe-Moreno
IBM Research
Rahul Ramachandran
NASA/MSFC
Informatics, Data Science