🤖 AI Summary
This work addresses two limitations of existing Earth observation foundation models: rigid architectures that hinder the integration of heterogeneous multi-source satellite data, and fixed patch sizes that prevent flexible trade-offs between computational cost and accuracy. To overcome these, we propose THOR, the first compute-adaptive foundation model to unify native-resolution data from Sentinel-1, -2, and -3. Pre-trained with randomized patch sizes and input image sizes, THOR can be deployed with any patch size at inference time, trading efficiency against feature resolution without retraining. Pre-trained on the newly curated large-scale multi-sensor dataset THOR Pretrain, the model achieves state-of-the-art results on downstream benchmarks, particularly in data-limited regimes such as the PANGAEA 10% split, supporting its applicability to climate and society applications.
📝 Abstract
Current Earth observation foundation models are architecturally rigid, struggle with heterogeneous sensors, and are constrained to fixed patch sizes. This limits their deployment in real-world scenarios requiring flexible compute-accuracy trade-offs. We propose THOR, a "compute-adaptive" foundation model that solves both input heterogeneity and deployment rigidity. THOR is the first architecture to unify data from Copernicus Sentinel-1, -2, and -3 (OLCI & SLSTR) satellites, processing their native 10 m to 1000 m resolutions in a single model. We pre-train THOR with a novel randomized patch and input image size strategy. This allows a single set of pre-trained weights to be deployed at inference with any patch size, enabling a dynamic trade-off between computational cost and feature resolution without retraining. We pre-train THOR on THOR Pretrain, a new, large-scale multi-sensor dataset, and demonstrate state-of-the-art performance on downstream benchmarks, particularly in data-limited regimes like the PANGAEA 10% split, validating that THOR's flexible feature generation excels for diverse climate and society applications.
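
To make the compute versus feature-resolution trade-off concrete, the following is a minimal arithmetic sketch, not THOR's actual implementation. It assumes a standard ViT-style encoder that splits a square image into non-overlapping square patches, and it approximates self-attention cost as quadratic in the number of tokens. The function names, the 256-pixel input size, and the listed patch sizes are illustrative assumptions that do not come from the abstract.

```python
def token_count(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches (tokens) for a square image.

    Assumption: ViT-style patchification; THOR's real tokenizer may differ.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def relative_attention_cost(
    image_size: int, patch_size: int, reference_patch_size: int
) -> float:
    """Self-attention cost relative to a reference patch size.

    Assumption: cost scales with the square of the token count.
    """
    n = token_count(image_size, patch_size)
    n_ref = token_count(image_size, reference_patch_size)
    return (n / n_ref) ** 2


if __name__ == "__main__":
    # Hypothetical 256x256 input; patch size 16 is the reference.
    for p in (4, 8, 16, 32):
        tokens = token_count(256, p)
        cost = relative_attention_cost(256, p, reference_patch_size=16)
        print(f"patch={p:>2}  tokens={tokens:>5}  relative attention cost={cost}")
```

Under these assumptions, halving the patch size quadruples the number of tokens (finer feature resolution) and raises the attention cost roughly sixteen-fold, which is the kind of trade-off a single set of weights usable at any patch size would let a user tune at inference time.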