ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling

📅 2025-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing AI methods for climate downscaling generalize poorly across variables and geographies and are computationally expensive, in particular because Vision Transformer (ViT) self-attention scales quadratically, O(N²), with sequence length, which hinders global high-resolution modeling. To address this, the authors propose ORBIT-2, a scalable foundation model built on two components: Reslim (Residual Slim ViT), a lightweight architecture that combines residual learning with Bayesian regularization, and TILES, a tile-wise sequence scaling algorithm that reduces self-attention complexity from quadratic to linear and enables sequences of up to 4.2 billion tokens. ORBIT-2 scales to 10 billion parameters across 32,768 GPUs, sustains up to 1.8 exaFLOPS, and reaches 92–98% strong scaling efficiency. It supports global downscaling to 0.9 km resolution and attains R² scores of 0.98–0.99 against observation data on 7 km resolution benchmarks. The model targets the regional decision-making bottleneck created by sparse observations and coarse-resolution climate models.

📝 Abstract

Sparse observations and coarse-resolution climate models limit effective regional decision-making, underscoring the need for robust downscaling. However, existing AI methods struggle with generalization across variables and geographies and are constrained by the quadratic complexity of Vision Transformer (ViT) self-attention. We introduce ORBIT-2, a scalable foundation model for global, hyper-resolution climate downscaling. ORBIT-2 incorporates two key innovations: (1) Residual Slim ViT (Reslim), a lightweight architecture with residual learning and Bayesian regularization for efficient, robust prediction; and (2) TILES, a tile-wise sequence scaling algorithm that reduces self-attention complexity from quadratic to linear, enabling long-sequence processing and massive parallelism. ORBIT-2 scales to 10 billion parameters across 32,768 GPUs, achieving up to 1.8 ExaFLOPS sustained throughput and 92-98% strong scaling efficiency. It supports downscaling to 0.9 km global resolution and processes sequences up to 4.2 billion tokens. On 7 km resolution benchmarks, ORBIT-2 achieves high accuracy with R^2 scores in the range of 0.98 to 0.99 against observation data.
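The R^2 scores quoted at the end of the abstract can be read against the standard coefficient of determination. The sketch below is a minimal, unweighted version in plain Python; the function name `r2_score` and the sample values are illustrative assumptions, and the paper may use a different variant (for example, area weighting) that this summary does not describe.

```python
from typing import Sequence


def r2_score(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Unweighted coefficient of determination, R^2 = 1 - SS_res / SS_tot.

    Assumption: this is the textbook definition, not necessarily the exact
    metric used by the ORBIT-2 authors.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the downscaled prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical temperature values, for illustration only.
    obs = [281.2, 283.5, 279.9, 285.1]
    pred = [281.0, 283.9, 280.2, 284.8]
    print(f"R^2 = {r2_score(obs, pred):.4f}")
```

An R^2 of 1 means the prediction matches the observations exactly, and 0 means it does no better than predicting the observed mean everywhere.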

Problem

Research questions and friction points this paper is trying to address.

Sparse observations limit regional climate decision-making

AI methods lack generalization across variables and geographies

Vision Transformer self-attention has quadratic complexity limitations
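To make the quadratic limitation concrete, the sketch below counts the pairwise attention scores needed for a sequence, first with full self-attention and then under the assumption that attention is restricted to fixed-size, non-overlapping tiles. That tiling assumption is only one simple way to obtain linear cost and is not confirmed here as the actual TILES algorithm; the function names and the tile size of 4,096 are hypothetical.

```python
def full_attention_pairs(num_tokens: int) -> int:
    """Every token attends to every token: N * N score entries."""
    return num_tokens * num_tokens


def tiled_attention_pairs(num_tokens: int, tile_size: int) -> int:
    """Tokens attend only within their own tile (assumed scheme).

    With a fixed tile size T the total is about N * T, which grows
    linearly in N instead of quadratically.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    full_tiles, remainder = divmod(num_tokens, tile_size)
    # Each full tile costs T * T; a trailing partial tile costs r * r.
    return full_tiles * tile_size * tile_size + remainder * remainder


if __name__ == "__main__":
    n = 4_200_000_000  # sequence length quoted in the abstract
    t = 4_096          # hypothetical tile size, not taken from the paper
    full = full_attention_pairs(n)
    tiled = tiled_attention_pairs(n, t)
    print(f"full attention pairs : {full:.3e}")
    print(f"tiled attention pairs: {tiled:.3e}")
    print(f"reduction factor     : {full / tiled:.3e}")
```

Under this assumption, doubling the number of tokens doubles the tiled cost but quadruples the full-attention cost.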

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reslim: lightweight ViT with residual learning

TILES: tile-wise algorithm that reduces self-attention complexity from quadratic to linear

Scales to 10B parameters on 32,768 GPUs
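The abstract reports 92-98% strong scaling efficiency for these runs. As a rough guide to what that figure measures, the sketch below uses the usual definition for a fixed problem size: the ideal time at the larger GPU count divided by the measured time. The function name and the timings are hypothetical, and the baseline configuration the authors used is not given in this summary.

```python
def strong_scaling_efficiency(
    base_gpus: int, base_time: float, gpus: int, time: float
) -> float:
    """Strong scaling efficiency for a fixed total problem size.

    efficiency = (base_time * base_gpus) / (time * gpus)
    A value of 1.0 means the runtime shrank in exact proportion to the
    number of GPUs added.
    """
    if min(base_gpus, gpus) <= 0 or min(base_time, time) <= 0:
        raise ValueError("GPU counts and times must be positive")
    return (base_time * base_gpus) / (time * gpus)


if __name__ == "__main__":
    # Hypothetical step times in seconds, for illustration only.
    eff = strong_scaling_efficiency(
        base_gpus=512, base_time=100.0, gpus=1024, time=52.0
    )
    print(f"strong scaling efficiency: {eff:.1%}")
```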

🔎 Similar Papers

Fast, Scale-Adaptive, and Uncertainty-Aware Downscaling of Earth System Model Fields with Generative Machine Learning