MoWE : A Mixture of Weather Experts

📅 2025-09-10
🤖 AI Summary
Recent data-driven weather forecasting models have exhibited performance stagnation. To address this, we propose a novel forecasting paradigm based on a multi-expert ensemble—without designing new base models—that dynamically weights the outputs of multiple existing models via a Vision Transformer-based gating network. Our key innovation is a spatiotemporally adaptive dynamic weighting mechanism that optimizes fusion weights according to forecast lead time and geographic location. Coupled with a lightweight training strategy, the framework enables efficient, scalable mixture-of-experts modeling. Evaluated on a 2-day forecasting task, our method reduces RMSE by 10% relative to the best-performing AI-based meteorological model, significantly outperforming naïve averaging ensembles while incurring lower computational overhead.

📝 Abstract
Data-driven weather models have recently achieved state-of-the-art performance, yet progress has plateaued in recent years. This paper introduces a Mixture of Experts (MoWE) approach as a novel paradigm to overcome these limitations, not by creating a new forecaster, but by optimally combining the outputs of existing models. The MoWE model is trained with significantly lower computational resources than the individual experts. Our model employs a Vision Transformer-based gating network that dynamically learns to weight the contributions of multiple "expert" models at each grid point, conditioned on forecast lead time. This approach creates a synthesized deterministic forecast that is more accurate than any individual component in terms of Root Mean Squared Error (RMSE). Our results demonstrate the effectiveness of this method, achieving up to a 10% lower RMSE than the best-performing AI weather model on a 2-day forecast horizon, significantly outperforming individual experts as well as a simple average across experts. This work presents a computationally efficient and scalable strategy to push the state of the art in data-driven weather prediction by making the most out of leading high-quality forecast models.
Problem

Research questions and friction points this paper is trying to address.

Combining existing weather models optimally for better forecasts
Reducing computational resources compared to individual expert models
Improving forecast accuracy beyond state-of-the-art AI weather models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Experts combination approach
Vision Transformer-based gating network
Dynamic weighting of expert models
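The core idea above — a gating network producing per-grid-point weights that blend several expert forecasts into one deterministic field — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, array shapes, and the use of plain NumPy (in place of the paper's Vision Transformer gating network conditioned on lead time) are all assumptions for clarity.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_experts(expert_forecasts, gating_logits):
    """Blend expert forecasts with per-grid-point weights.

    expert_forecasts: (n_experts, lat, lon) forecasts of one variable
        at one lead time, one slice per expert model.
    gating_logits: (n_experts, lat, lon) raw scores; in the paper these
        come from a ViT gating network conditioned on lead time, here
        they are simply an input array.
    Returns a (lat, lon) fused deterministic forecast.
    """
    weights = softmax(gating_logits, axis=0)  # weights sum to 1 over experts
    return (weights * expert_forecasts).sum(axis=0)

# Toy example: three "experts" on a 2x2 grid.
rng = np.random.default_rng(0)
forecasts = rng.normal(size=(3, 2, 2))
logits = rng.normal(size=(3, 2, 2))
fused = fuse_experts(forecasts, logits)
```

Because the softmax makes the fusion a convex combination at every grid point, uniform logits recover the simple expert average that the paper uses as a baseline, while learned, location-dependent logits let the model favor whichever expert is locally most skillful.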
Authors

Dibyajyoti Chakraborty — The Pennsylvania State University
Romit Maulik — Assistant Professor and ICDS Co-Hire, Pennsylvania State University (Scientific Machine Learning, Computational Fluid Dynamics)
Peter Harrington — NVIDIA (Deep learning, artificial intelligence)
Dallas Foster — NVIDIA Corporation
Mohammad Amin Nabian — NVIDIA Corporation
Sanjay Choudhry — NVIDIA Corporation