IndiaWeatherBench: A Dataset and Benchmark for Data-Driven Regional Weather Forecasting over India

📅 2025-08-30
🤖 AI Summary
Weather forecasting research over the Indian subcontinent lacks standardized datasets and evaluation protocols, which limits fair comparison and reproducibility. This work introduces IndiaWeatherBench, a benchmark for data-driven regional weather forecasting built from high-resolution regional reanalysis products. It evaluates a range of architectures, including UNets, Transformers, and graph-based networks, along with different boundary conditioning strategies and training objectives, under a common suite of deterministic and probabilistic metrics. Key contributions are: (1) open-sourced raw and preprocessed datasets, model implementations, and evaluation pipelines; (2) a reproducible benchmark for India that is extensible to other geographic regions; and (3) strong baselines that enable consistent model comparison and support downstream uses such as climate adaptation and disaster mitigation.

📝 Abstract
Regional weather forecasting is a critical problem for localized climate adaptation, disaster mitigation, and sustainable development. While machine learning has shown impressive progress in global weather forecasting, regional forecasting remains comparatively underexplored. Existing efforts often use different datasets and experimental setups, limiting fair comparison and reproducibility. We introduce IndiaWeatherBench, a comprehensive benchmark for data-driven regional weather forecasting focused on the Indian subcontinent. IndiaWeatherBench provides a curated dataset built from high-resolution regional reanalysis products, along with a suite of deterministic and probabilistic metrics to facilitate consistent training and evaluation. To establish strong baselines, we implement and evaluate a range of models across diverse architectures, including UNets, Transformers, and Graph-based networks, as well as different boundary conditioning strategies and training objectives. While focused on India, IndiaWeatherBench is easily extensible to other geographic regions. We open-source all raw and preprocessed datasets, model implementations, and evaluation pipelines to promote accessibility and future development. We hope IndiaWeatherBench will serve as a foundation for advancing regional weather forecasting research. Code is available at https://github.com/tung-nd/IndiaWeatherBench.
Problem

Research questions and friction points this paper is trying to address.

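Regional ML weather forecasting remains underexplored compared with global forecasting
Existing efforts use different datasets and experimental setups, limiting fair comparison and reproducibility
Evaluating diverse ML architectures (UNets, Transformers, graph-based networks) under one consistent protocol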
Innovation

Methods, ideas, or system contributions that make the work stand out.

Regional reanalysis dataset for India
Multiple model architectures evaluated
Open-source data and evaluation pipeline
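
As a rough illustration of what deterministic and probabilistic metrics in such an evaluation pipeline can look like, the sketch below computes RMSE for a single forecast and the empirical CRPS for an ensemble at one grid point. This is not the benchmark's own code: the choice of RMSE and CRPS, the function names `rmse` and `ensemble_crps`, and the flat-list inputs are all assumptions made for illustration, and the repository should be consulted for the metrics it actually defines.

```python
import math
from typing import Sequence


def rmse(forecast: Sequence[float], truth: Sequence[float]) -> float:
    """Root-mean-square error over a flattened grid (deterministic metric).

    Hypothetical helper: unweighted, with no latitude or area weighting.
    """
    if len(forecast) != len(truth):
        raise ValueError("forecast and truth must have the same length")
    squared = sum((f - t) ** 2 for f, t in zip(forecast, truth))
    return math.sqrt(squared / len(truth))


def ensemble_crps(members: Sequence[float], truth: float) -> float:
    """Empirical CRPS at one grid point: E|X - y| - 0.5 * E|X - X'|.

    Hypothetical helper: X and X' range over the ensemble members,
    and y is the observed (reanalysis) value. Lower is better.
    """
    m = len(members)
    # Mean absolute error of the members against the truth.
    skill = sum(abs(x - truth) for x in members) / m
    # Mean absolute difference over all ordered pairs of members.
    spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return skill - 0.5 * spread


if __name__ == "__main__":
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(ensemble_crps([1.0, 3.0], 2.0))
```

With a single ensemble member the spread term vanishes and the CRPS reduces to the absolute error, which is why it is often used to compare deterministic and probabilistic forecasts on the same scale.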