BlockGPT: Spatio-Temporal Modelling of Rainfall via Frame-Level Autoregression

📅 2025-10-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Short-term precipitation nowcasting demands both high accuracy and real-time inference, yet existing token-based autoregressive models suffer from flawed inductive biases and slow inference, while diffusion models incur prohibitive computational overhead. To address this, we propose BlockGPT, a frame-level autoregressive video forecasting framework. It introduces batched tokenization to discretize 2D precipitation fields into tokens and decouples spatiotemporal modeling: intra-frame self-attention captures spatial structure, while inter-frame causal attention ensures temporal consistency. This design is architecture-agnostic and compatible with Transformer backbones. Evaluated on the KNMI and SEVIR benchmarks, BlockGPT achieves state-of-the-art performance, improving event localization (F1 score +8.2%) and classification (AUC +5.6%). Crucially, it attains 31× faster single-step inference than baseline models, meeting operational real-time requirements.
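The core of the factorization described above is an attention mask: tokens within the same frame attend to each other bidirectionally, while attention across frames is causal (a token never sees a future frame). The paper does not publish this exact code; the following is a minimal numpy sketch of such a block-causal mask, with the frame and token counts chosen purely for illustration.

```python
import numpy as np

def block_causal_mask(num_frames: int, tokens_per_frame: int) -> np.ndarray:
    """Boolean mask for frame-level autoregression:
    full (bidirectional) attention within a frame,
    causal attention across frames.
    mask[i, j] is True iff token i may attend to token j."""
    n = num_frames * tokens_per_frame
    # Which frame each flattened token position belongs to.
    frame_idx = np.arange(n) // tokens_per_frame
    # Token i may attend to token j iff j's frame is not in i's future.
    return frame_idx[:, None] >= frame_idx[None, :]

# Toy example: 3 frames of 2 tokens each.
mask = block_causal_mask(num_frames=3, tokens_per_frame=2)
```

Compared with a plain token-level causal mask, this lets the model generate one full frame per step (all of a frame's tokens are mutually visible) while still conditioning only on past frames, which is what enables the reported single-step inference speedup.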

📝 Abstract
Predicting precipitation maps is a highly complex spatiotemporal modeling task, critical for mitigating the impacts of extreme weather events. Short-term precipitation forecasting, or nowcasting, requires models that are not only accurate but also computationally efficient for real-time applications. Current methods, such as token-based autoregressive models, often suffer from flawed inductive biases and slow inference, while diffusion models can be computationally intensive. To address these limitations, we introduce BlockGPT, a generative autoregressive transformer that uses a batched tokenization (Block) method to predict full two-dimensional fields (frames) at each time step. Conceived as a model-agnostic paradigm for video prediction, BlockGPT factorizes space-time by using self-attention within each frame and causal attention across frames; in this work, we instantiate it for precipitation nowcasting. We evaluate BlockGPT on two precipitation datasets, viz. KNMI (Netherlands) and SEVIR (U.S.), comparing it to state-of-the-art baselines including token-based (NowcastingGPT) and diffusion-based (DiffCast+Phydnet) models. The results show that BlockGPT achieves superior accuracy, better event localization as measured by categorical metrics, and inference speeds up to 31× faster than comparable baselines.
Problem

Research questions and friction points this paper is trying to address.

Predicting short-term precipitation maps for weather nowcasting
Overcoming slow inference in token-based autoregressive models
Addressing computational intensity in diffusion-based forecasting methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frame-level autoregressive transformer for rainfall prediction
Batched tokenization method for full 2D field forecasting
Spatio-temporal factorization with dual attention mechanisms
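The batched tokenization idea in the bullets above amounts to turning each 2D precipitation field into a sequence of patch tokens that are predicted together as one block, rather than one token at a time. The paper's own tokenizer is not reproduced here; this is an illustrative numpy sketch of non-overlapping patchification, with the 64×64 frame size and 8×8 patch size chosen as assumptions for the example.

```python
import numpy as np

def patchify(frame: np.ndarray, patch: int = 8) -> np.ndarray:
    """Split one 2D precipitation field (H, W) into a flat sequence of
    non-overlapping patch tokens, each flattened to a patch*patch vector."""
    H, W = frame.shape
    assert H % patch == 0 and W % patch == 0, "frame must tile evenly"
    # (H, W) -> (H/p, p, W/p, p) -> (H/p, W/p, p, p) -> (num_patches, p*p)
    return (frame.reshape(H // patch, patch, W // patch, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, patch * patch))

# Toy example: a 64x64 radar frame becomes 64 patch tokens of 64 values each.
frame = np.random.rand(64, 64)
tokens = patchify(frame)
```

In a frame-level autoregressive model, all tokens of the next frame are produced in a single forward pass conditioned on past frames, rather than decoded sequentially, which is the source of the inference-speed advantage over token-by-token baselines.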