🤖 AI Summary
This study addresses the challenging problem of subseasonal-to-seasonal (weeks-to-months) forecasting of wildfire spatiotemporal patterns to support fuel management and prepositioning of emergency resources. To capture multiscale couplings in the Earth system, we propose the first Vision Transformer framework that integrates teleconnection mechanisms: an asymmetric tokenization strategy produces heterogeneous tokens that jointly encode coarse-resolution global fields, fine-scale local drivers, and teleconnection indices (e.g., ENSO, PNA); a transformer encoder processes these tokens together, and a decoder preserves spatial structure by mapping local tokens to their prediction patches. Evaluated on the SeasFire dataset, our model achieves an AUPRC of 0.603 for four-month-ahead prediction, only marginally below the 0.630 it reaches at zero lead time, and outperforms the U-Net++, ViT, and climatological baselines. This work establishes the first end-to-end integration of physically grounded teleconnection priors with a Vision Transformer architecture for wildfire forecasting.
📝 Abstract
Forecasting wildfires weeks to months in advance is difficult, yet crucial for planning fuel treatments and allocating resources. While short-term predictions typically rely on local weather conditions, long-term forecasting requires accounting for the Earth's interconnectedness, including global patterns and teleconnections. We introduce TeleViT, a Teleconnection-aware Vision Transformer that integrates (i) fine-scale local fire drivers, (ii) coarsened global fields, and (iii) teleconnection indices. This multi-scale fusion is achieved through an asymmetric tokenization strategy that produces heterogeneous tokens processed jointly by a transformer encoder, followed by a decoder that preserves spatial structure by mapping local tokens to their corresponding prediction patches.
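The tokenization and decoding steps described above can be illustrated with a minimal sketch. It is not the TeleViT implementation: the single-channel grids, the patch sizes, the one-token-per-index choice, and the helper names `patchify`, `unpatchify`, and `build_tokens` are all assumptions made for illustration, and the learned projection of each token type to a shared embedding width is omitted.

```python
from typing import List, Tuple

Grid = List[List[float]]  # a single-channel 2-D field, rows x cols


def patchify(grid: Grid, patch: int) -> List[List[float]]:
    """Split a grid into non-overlapping patch x patch tiles, flattened row-major."""
    rows, cols = len(grid), len(grid[0])
    if rows % patch or cols % patch:
        raise ValueError("grid size must be divisible by the patch size")
    tiles = []
    for r0 in range(0, rows, patch):
        for c0 in range(0, cols, patch):
            tiles.append([grid[r][c]
                          for r in range(r0, r0 + patch)
                          for c in range(c0, c0 + patch)])
    return tiles


def unpatchify(tiles: List[List[float]], rows: int, cols: int, patch: int) -> Grid:
    """Inverse of patchify: place each tile back at its original grid position."""
    grid = [[0.0] * cols for _ in range(rows)]
    tiles_per_row = cols // patch
    for i, tile in enumerate(tiles):
        r0 = (i // tiles_per_row) * patch
        c0 = (i % tiles_per_row) * patch
        for k, value in enumerate(tile):
            grid[r0 + k // patch][c0 + k % patch] = value
    return grid


def build_tokens(local: Grid, global_field: Grid, indices: List[float],
                 local_patch: int, global_patch: int) -> List[Tuple[str, List[float]]]:
    """Asymmetric tokenization (assumed form): each input type uses its own patching."""
    tokens = [("local", t) for t in patchify(local, local_patch)]
    tokens += [("global", t) for t in patchify(global_field, global_patch)]
    tokens += [("index", [v]) for v in indices]  # one token per teleconnection index
    return tokens


# Toy inputs: a 4x4 local window, a 4x8 coarsened global field, two index values.
local = [[float(4 * r + c) for c in range(4)] for r in range(4)]
global_field = [[0.1 * (8 * r + c) for c in range(8)] for r in range(4)]
tokens = build_tokens(local, global_field, [0.5, -1.2], local_patch=2, global_patch=4)

# A decoder that preserves spatial structure only needs the local tokens:
# each one maps back to the patch it came from.
local_tokens = [values for kind, values in tokens if kind == "local"]
restored = unpatchify(local_tokens, rows=4, cols=4, patch=2)
```

In this toy setup the encoder would see one sequence containing all three token types, while the output map is assembled from the local tokens alone, which is the sense in which the decoder keeps the spatial layout.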
Using the global SeasFire dataset (2001-2021, 8-day resolution), TeleViT improves AUPRC over U-Net++, ViT, and climatology across all lead times, including horizons up to four months. At zero lead, TeleViT with indices and global inputs reaches an AUPRC of 0.630 (ViT 0.617, U-Net 0.620). At a lead of 16 × 8-day steps (128 days, around four months), TeleViT variants using global input maintain 0.601-0.603 (ViT 0.582, U-Net 0.578). TeleViT surpasses the climatology baseline (0.572) at all lead times. Regional results show the highest skill in seasonally consistent fire regimes, such as African savannas, and lower skill in boreal and arid regions. Attention and attribution analyses indicate that predictions rely mainly on local tokens, with global fields and indices contributing coarse contextual information. These findings suggest that architectures explicitly encoding large-scale Earth-system context can extend wildfire predictability on subseasonal-to-seasonal timescales.
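For readers unfamiliar with the metric, AUPRC (area under the precision-recall curve) can be computed as average precision over the ranked predictions. The sketch below is one common step-wise estimator, not necessarily the exact one used for the reported numbers; the function name `average_precision` and the toy labels and scores are assumptions, and tied scores are not given any special treatment.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AUPRC: mean of precision@k taken at the rank k of each positive."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("need at least one positive label")
    # Rank cells from the highest predicted fire probability to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / positives


# Toy example: four grid cells, two of which burned (label 1).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)
```

Unlike accuracy or ROC AUC, this score has a chance level that depends on how rare the positive class is in the evaluation set, so comparisons between models are most meaningful on the same data.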