STC-ViT: Spatio Temporal Continuous Vision Transformer for Weather Forecasting

📅 2024-02-28

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

Traditional numerical weather prediction (NWP) is computationally expensive, while existing Transformer-based models are discrete and physics-agnostic, which limits their ability to capture the continuous spatio-temporal dynamics of the atmosphere. To address these limitations, this work proposes STC-ViT, an architecture that integrates Neural Ordinary Differential Equation (Neural ODE) layers with multi-head attention in the Vision Transformer framework, so that the evolution of meteorological fields is modeled in continuous time. The authors also define a customised physics-informed loss that penalizes predictions for deviating from physical laws. Trained on low-resolution (1.5°), 6-hourly data, STC-ViT is reported to need fewer parameters and less compute than high-resolution counterparts while remaining competitive with state-of-the-art data-driven models trained on higher-resolution data for global medium-range forecasting.

📝 Abstract

Operational weather forecasting systems rely on computationally expensive physics-based models. Recently, transformer-based models have shown remarkable potential in weather forecasting, achieving state-of-the-art results. However, transformers are discrete and physics-agnostic models, which limits their ability to learn the continuous spatio-temporal features of the dynamical weather system. We address this issue with STC-ViT, a Spatio-Temporal Continuous Vision Transformer for weather forecasting. STC-ViT combines continuous-time Neural ODE layers with a multi-head attention mechanism to learn the continuous evolution of weather over time. The attention mechanism is encoded as a differentiable function in the transformer architecture to model the complex weather dynamics. Further, we define a customised physics-informed loss for STC-ViT which penalizes the model's predictions for deviating from physical laws. We evaluate STC-ViT against an operational Numerical Weather Prediction (NWP) model and several deep-learning-based weather forecasting models. STC-ViT, trained on 1.5-degree 6-hourly data, demonstrates computational efficiency and competitive performance compared to state-of-the-art data-driven models trained on higher-resolution data for global forecasting.
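
The abstract does not say which physical laws the loss enforces, so the following is only a minimal sketch of the general pattern: a data term plus a weighted penalty on a physics residual. The penalty used here (drift in the area-weighted global mean of a field between the previous state and the prediction) is a stand-in chosen for illustration, and every name in it (`area_weights`, `global_mean`, `physics_informed_loss`, `lam`) is hypothetical rather than taken from the paper.

```python
import numpy as np

def area_weights(lat_deg):
    """Cosine-of-latitude weights, normalised to mean 1 (grid cells shrink toward the poles)."""
    w = np.cos(np.deg2rad(lat_deg))
    return w / w.mean()

def global_mean(field, lat_deg):
    """Area-weighted mean of a (lat, lon) field."""
    w = area_weights(lat_deg)[:, None]  # broadcast the weights over longitude
    return float((field * w).mean())

def physics_informed_loss(pred, target, prev, lat_deg, lam=0.1):
    """Data MSE plus lam times an illustrative conservation-style penalty.

    ASSUMPTION: the penalty is a placeholder for whatever physical
    constraint the paper actually uses; it is not the authors' loss.
    """
    data_term = float(np.mean((pred - target) ** 2))
    drift = global_mean(pred, lat_deg) - global_mean(prev, lat_deg)
    return data_term + lam * drift ** 2
```

With `lam=0` this reduces to plain mean squared error; larger values trade fit to the target against agreement with the chosen constraint.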

Problem

Research questions and friction points this paper is trying to address.

Develops a continuous spatio-temporal transformer for efficient weather forecasting

Reduces computational cost and parameter redundancy in deep transformer models

Achieves competitive medium-range forecasts with lower-resolution data and reduced compute requirements

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates Fourier Neural Operator for global spatial modeling

Uses transformer-parameterized Neural ODE for continuous-time dynamics

Achieves competitive performance with a shallow, single-layer transformer encoder
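
As a rough illustration of the last two items, the sketch below treats a single multi-head self-attention layer as the vector field of an ODE, dz/dt = f(z), and integrates it with a fixed-step fourth-order Runge-Kutta solver. It assumes NumPy, random untrained weights, and a time-independent vector field; the function names, parameter layout, and choice of solver are assumptions for illustration and are not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(x, n_heads):
    """(tokens, dim) -> (heads, tokens, dim // heads)."""
    n_tokens, dim = x.shape
    return x.reshape(n_tokens, n_heads, dim // n_heads).transpose(1, 0, 2)

def attention_field(z, params, n_heads):
    """Vector field f(z): one multi-head self-attention layer over the tokens."""
    n_tokens, dim = z.shape
    q = split_heads(z @ params["wq"], n_heads)
    k = split_heads(z @ params["wk"], n_heads)
    v = split_heads(z @ params["wv"], n_heads)
    scores = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dim // n_heads))
    mixed = (scores @ v).transpose(1, 0, 2).reshape(n_tokens, dim)
    return mixed @ params["wo"]

def rk4_integrate(f, z0, t0, t1, n_steps):
    """Integrate dz/dt = f(z) from t0 to t1 with classical RK4 steps."""
    h = (t1 - t0) / n_steps
    z = z0
    for _ in range(n_steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z = z + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return z

def init_params(dim, rng, scale=0.1):
    """Random, untrained projection matrices (hypothetical layout)."""
    return {name: scale * rng.standard_normal((dim, dim))
            for name in ("wq", "wk", "wv", "wo")}

# Usage: evolve 5 tokens of width 8 over one unit of time with a 2-head field.
rng = np.random.default_rng(0)
params = init_params(8, rng)
z0 = rng.standard_normal((5, 8))
z1 = rk4_integrate(lambda z: attention_field(z, params, 2), z0, 0.0, 1.0, 4)
```

Because the same attention weights are reused at every solver step, depth comes from integration steps rather than from stacked layers, which is one way to read the single-layer claim above. An adaptive solver could replace the fixed-step loop without changing the vector field.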
