MoWE : A Mixture of Weather Experts

📅 2025-09-10
🤖 AI Summary
Recent data-driven weather forecasting models have exhibited performance stagnation. To address this, we propose a novel forecasting paradigm based on a multi-expert ensemble—without designing new base models—that dynamically weights the outputs of multiple existing models via a Vision Transformer-based gating network. Our key innovation is a spatiotemporally adaptive dynamic weighting mechanism that optimizes fusion weights according to forecast lead time and geographic location. Coupled with a lightweight training strategy, the framework enables efficient, scalable mixture-of-experts modeling. Evaluated on a 2-day forecasting task, our method reduces RMSE by 10% relative to the best-performing AI-based meteorological model, significantly outperforming naïve averaging ensembles while incurring lower computational overhead.

📝 Abstract
Data-driven weather models have recently achieved state-of-the-art performance, yet progress has plateaued in recent years. This paper introduces a Mixture of Experts (MoWE) approach as a novel paradigm to overcome these limitations, not by creating a new forecaster, but by optimally combining the outputs of existing models. The MoWE model is trained with significantly lower computational resources than the individual experts. Our model employs a Vision Transformer-based gating network that dynamically learns to weight the contributions of multiple "expert" models at each grid point, conditioned on forecast lead time. This approach creates a synthesized deterministic forecast that is more accurate than any individual component in terms of Root Mean Squared Error (RMSE). Our results demonstrate the effectiveness of this method, achieving up to a 10% lower RMSE than the best-performing AI weather model on a 2-day forecast horizon, significantly outperforming individual experts as well as a simple average across experts. This work presents a computationally efficient and scalable strategy to push the state of the art in data-driven weather prediction by making the most out of leading high-quality forecast models.
Problem

Research questions and friction points this paper is trying to address.

Combining existing weather models optimally for better forecasts
Reducing computational resources compared to individual expert models
Improving forecast accuracy beyond state-of-the-art AI weather models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Experts combination approach
Vision Transformer-based gating network
Dynamic weighting of expert models
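The core idea above — a gating network producing per-grid-point weights that blend several expert forecasts into one deterministic field — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, array shapes, and the use of plain NumPy (in place of the paper's Vision Transformer gating network conditioned on lead time) are all assumptions for clarity.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_experts(expert_forecasts, gating_logits):
    """Blend expert forecasts with per-grid-point weights.

    expert_forecasts: (n_experts, lat, lon) forecasts of one variable
        at one lead time, one slice per expert model.
    gating_logits: (n_experts, lat, lon) raw scores; in the paper these
        come from a ViT gating network conditioned on lead time, here
        they are simply an input array.
    Returns a (lat, lon) fused deterministic forecast.
    """
    weights = softmax(gating_logits, axis=0)  # weights sum to 1 over experts
    return (weights * expert_forecasts).sum(axis=0)

# Toy example: three "experts" on a 2x2 grid.
rng = np.random.default_rng(0)
forecasts = rng.normal(size=(3, 2, 2))
logits = rng.normal(size=(3, 2, 2))
fused = fuse_experts(forecasts, logits)
```

Because the softmax makes the fusion a convex combination at every grid point, uniform logits recover the simple expert average that the paper uses as a baseline, while learned, location-dependent logits let the model favor whichever expert is locally most skillful.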
Authors

Dibyajyoti Chakraborty — The Pennsylvania State University
Romit Maulik — Assistant Professor and ICDS Co-Hire, Pennsylvania State University (Scientific Machine Learning, Computational Fluid Dynamics)
Peter Harrington — NVIDIA (Deep learning, artificial intelligence)
Dallas Foster — NVIDIA Corporation
Mohammad Amin Nabian — NVIDIA Corporation
Sanjay Choudhry — NVIDIA Corporation