Operational convection-permitting COSMO/ICON ensemble predictions at observation sites (CIENS)

📅 2025-08-05
🤖 AI Summary
Benchmark datasets of station-level ensemble forecasts that span many years, and therefore several updates of the underlying model, remain scarce. The CIENS dataset addresses this gap. It contains forecasts from the German Weather Service's operational convection-permitting COSMO/ICON ensemble for December 2010 to June 2023: 55 meteorological variables at hourly lead times from 0 to 21 hours, from two daily runs initialized at 00 and 12 UTC, mapped to the locations of synoptic stations. For a subset of these variables, spatially aggregated forecasts from surrounding grid points are also provided, together with station observations of six key variables at 170 locations across Germany. Because the period covers multiple model updates, CIENS supports studies of how forecasting methods can account for such changes, alongside ensemble post-processing and forecast verification. A post-processing use case illustrates the benefit of feeding the rich set of model predictors into machine-learning models.

📝 Abstract
We present the CIENS dataset, which contains ensemble weather forecasts from the operational convection-permitting numerical weather prediction model of the German Weather Service. It comprises forecasts for 55 meteorological variables mapped to the locations of synoptic stations, as well as additional spatially aggregated forecasts from surrounding grid points, available for a subset of these variables. Forecasts are available at hourly lead times from 0 to 21 hours for two daily model runs initialized at 00 and 12 UTC, covering the period from December 2010 to June 2023. Additionally, the dataset provides station observations for six key variables at 170 locations across Germany: pressure, temperature, hourly precipitation accumulation, wind speed, wind direction, and wind gusts. Since the forecasts are mapped to the observed locations, the data are delivered in a convenient format for analysis. The CIENS dataset complements the growing collection of benchmark datasets for weather and climate modeling. A key distinguishing feature is its long temporal extent, which encompasses multiple updates to the underlying numerical weather prediction model and thus supports investigations into how forecasting methods can account for such changes. In addition to detailing the design and contents of the CIENS dataset, we outline potential applications in ensemble post-processing, forecast verification, and related research areas. A use case focused on ensemble post-processing illustrates the benefits of incorporating the rich set of available model predictors into machine learning-based forecasting models.
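
The forecast layout described in the abstract (two daily runs at 00 and 12 UTC, hourly lead times from 0 to 21 hours) implies a simple rule for pairing a forecast with an observation: valid time = initialization time + lead time. The following is a minimal sketch of that bookkeeping in plain Python. The function names and the `observations` dictionary are hypothetical and chosen for illustration; they are not part of the CIENS data format or any accompanying software.

```python
from datetime import datetime, timedelta, timezone

LEAD_HOURS = range(0, 22)  # hourly lead times 0..21 h, as stated in the abstract
INIT_HOURS = (0, 12)       # model runs initialized at 00 and 12 UTC


def valid_times(init_time):
    """Return the valid time of every lead time for one model run."""
    if init_time.hour not in INIT_HOURS:
        raise ValueError("runs are initialized at 00 or 12 UTC")
    return [init_time + timedelta(hours=h) for h in LEAD_HOURS]


def match_observations(init_time, observations):
    """Pair each lead time with the observation valid at the same hour.

    `observations` is a hypothetical dict mapping valid time -> observed value.
    Lead times without a matching observation are skipped.
    """
    pairs = []
    for lead, valid in zip(LEAD_HOURS, valid_times(init_time)):
        if valid in observations:
            pairs.append((lead, valid, observations[valid]))
    return pairs


# Arbitrary example date inside the covered period (an assumption for illustration).
run = datetime(2020, 1, 1, 12, tzinfo=timezone.utc)
times = valid_times(run)
```

Here a 12 UTC run produces 22 valid times, the last of which falls at 09 UTC on the following day. Keeping all timestamps timezone-aware in UTC avoids ambiguity when forecasts and observations are joined.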
Problem

Research questions and friction points this paper is trying to address.

Ensemble weather forecasts mapped to observation sites
Long-term dataset for analyzing how forecasting methods cope with model updates
Supports post-processing, verification, and machine learning applications
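
As an illustration of the verification use mentioned above, the sketch below scores one ensemble forecast against one station observation with the continuous ranked probability score (CRPS), a score commonly used for ensemble forecasts. This is an assumption for illustration: the summary does not state which scores the paper uses. The estimator is the standard sample form, CRPS = mean|x_i - y| - 0.5 * mean|x_i - x_j|, and the function name is hypothetical.

```python
def ensemble_crps(members, observation):
    """Sample CRPS of an ensemble forecast for a single observation.

    members: list of ensemble member values (e.g. 2 m temperature forecasts)
    observation: the verifying station observation in the same units
    Lower values indicate a better forecast; 0 means all members hit the observation.
    """
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must contain at least one member")
    # Mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in members) / m
    # Half the mean absolute difference between all member pairs (ensemble spread).
    spread = sum(abs(xi - xj) for xi in members for xj in members) / (2 * m * m)
    return accuracy - spread


# Hypothetical two-member ensemble bracketing the observation.
score = ensemble_crps([1.0, 3.0], 2.0)
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which makes the CRPS directly comparable between deterministic and ensemble forecasts.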
Innovation

Methods, ideas, or system contributions that make the work stand out.

Convection-permitting ensemble weather forecasts
Mapped forecasts to synoptic station locations
Long temporal extent spanning multiple model updates
Authors
Sebastian Lerch
University of Marburg
Statistics and Probability, Forecasting, Machine Learning
Benedikt Schulz
Karlsruhe Institute of Technology
Statistics and Probability, Forecasting, Machine Learning
Reinhold Hess
Deutscher Wetterdienst, Offenbach, Germany
Annette Möller
Bielefeld University, Bielefeld, Germany
Cristina Primo
Deutscher Wetterdienst, Offenbach, Germany
Sebastian Trepte
Deutscher Wetterdienst, Offenbach, Germany
Susanne Theis
Deutscher Wetterdienst, Offenbach, Germany