🤖 AI Summary
This work proposes PARADIS, a machine learning weather forecasting model that explicitly embeds physical structure into its architecture to address the challenge of efficiently modeling long-range transport processes such as advection, which are typically encoded only implicitly in monolithic networks. Through functional decomposition, PARADIS separates prediction into distinct advection, diffusion, and reaction modules, and introduces a neural semi-Lagrangian operator that performs trajectory-based, differentiable interpolation on the sphere for transport. The model learns both the latent variables and their transport trajectories end to end. Trained at 1° resolution on the ERA5 benchmark for less than one GPU-month, it matches or exceeds both traditional numerical models, such as the ECMWF HRES forecast at 0.25° resolution, and state-of-the-art machine learning baselines such as GraphCast, at a fraction of the computational cost.
📝 Abstract
Recent machine-learning approaches to weather forecasting often employ a monolithic architecture, in which distinct physical mechanisms, such as advective transport, diffusion-like mixing, thermodynamic processes, and forcing, are represented implicitly within a single large network. This representation is particularly problematic for advection, where long-range transport must be treated with expensive global interaction mechanisms or through deep stacks of convolutional layers. To mitigate this, we present PARADIS, a physics-inspired global weather prediction model that imposes inductive biases on network behavior through a functional decomposition into advection, diffusion, and reaction blocks acting on latent variables. We implement advection through a Neural Semi-Lagrangian operator that performs trajectory-based transport via differentiable interpolation on the sphere, enabling end-to-end learning of both the latent modes to be transported and their characteristic trajectories. Diffusion-like processes are modeled through depthwise-separable spatial mixing, while local source terms and vertical interactions are modeled via pointwise channel interactions, yielding operator-level physical structure. PARADIS provides state-of-the-art forecast skill at a fraction of the training cost: on ERA5-based benchmarks, the 1° PARADIS model, with a total training cost of less than one GPU-month, meets or exceeds the performance of 0.25° traditional and machine-learning baselines, including the ECMWF HRES forecast and DeepMind's GraphCast.
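To make the semi-Lagrangian idea concrete, the following is a minimal planar sketch of one trajectory-based advection step: each grid point is traced backward along a velocity field to its departure point, and the field is sampled there by bilinear interpolation. This is an illustration only, not the paper's implementation: PARADIS interpolates on the sphere, acts on learned latent modes, and learns the velocities end to end, whereas here the grid is a flat periodic array and the velocities are supplied by hand. All function and variable names are illustrative.

```python
import numpy as np

def bilinear_interpolate(field, x, y):
    """Bilinear interpolation of a 2D field at fractional coordinates
    (x, y), with periodic boundary conditions. This smooth sampling is
    what makes a semi-Lagrangian step differentiable end to end."""
    H, W = field.shape
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    dx, dy = x - x0, y - y0
    x0 %= W; x1 = (x0 + 1) % W
    y0 %= H; y1 = (y0 + 1) % H
    return (field[y0, x0] * (1 - dx) * (1 - dy)
            + field[y0, x1] * dx * (1 - dy)
            + field[y1, x0] * (1 - dx) * dy
            + field[y1, x1] * dx * dy)

def semi_lagrangian_step(field, u, v, dt):
    """One semi-Lagrangian advection step: trace each grid point
    backward along the velocity field (u, v) over time dt and sample
    the field at the resulting departure point. In PARADIS the
    analogous velocities are learned, not prescribed."""
    H, W = field.shape
    yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    x_dep = xx - dt * u   # departure points on the backward trajectory
    y_dep = yy - dt * v
    return bilinear_interpolate(field, x_dep, y_dep)
```

For example, advecting a point mass with a uniform eastward velocity `u = 1` for `dt = 1` shifts it one grid cell to the right while conserving its total mass; in a learned model, gradients flow through the interpolation weights back to the velocity field.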