AI Summary
Existing pansharpening methods are predominantly evaluated at low resolutions and struggle to generalize to real-world high-resolution cross-scale scenarios. To address this limitation, this work introduces the PanScale dataset and the PanScale-Bench benchmark, along with ScaleFormer, the first general-purpose framework specifically designed for cross-scale pansharpening. ScaleFormer incorporates a Scale-Aware Patchify module and rotary positional encoding to parse images into variable-length patch sequences, enabling effective extrapolation to unseen scales. Extensive experiments demonstrate that ScaleFormer consistently outperforms state-of-the-art methods in both fusion quality and cross-scale generalization capability.
Abstract
Pansharpening aims to generate high-resolution multi-spectral (MS) images by fusing the spatial detail of panchromatic images with the spectral richness of low-resolution MS data. However, most existing methods are evaluated only under low-resolution settings, which limits their generalization to real-world, high-resolution scenarios. To bridge this gap, we systematically investigate the data, algorithmic, and computational challenges of cross-scale pansharpening. We first introduce PanScale, the first large-scale, cross-scale pansharpening dataset, accompanied by PanScale-Bench, a comprehensive benchmark for evaluating generalization across varying resolutions and scales. To realize scale generalization, we propose ScaleFormer, a novel architecture designed for multi-scale pansharpening. ScaleFormer reframes generalization across image resolutions as generalization across sequence lengths: it tokenizes images into patch sequences of fixed patch resolution but variable length proportional to image scale. A Scale-Aware Patchify module enables training for such variations from fixed-size crops. ScaleFormer then decouples intra-patch spatial feature learning from inter-patch sequential dependency modeling, incorporating Rotary Positional Encoding to enhance extrapolation to unseen scales. Extensive experiments show that our approach outperforms state-of-the-art methods in both fusion quality and cross-scale generalization. The datasets and source code will be made available upon acceptance.
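The core idea of recasting resolution generalization as sequence-length generalization can be illustrated with a minimal patchify sketch: the patch size stays fixed, so a larger input simply produces a longer token sequence with the same per-token dimension. This is a hypothetical NumPy illustration, not the paper's actual Scale-Aware Patchify implementation, which is not publicly specified here.

```python
import numpy as np

def patchify(img: np.ndarray, patch: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into a sequence of flattened patches.

    Because the patch size is fixed, the sequence length grows with
    image resolution: larger inputs yield more tokens, not bigger ones.
    """
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0, "image must tile evenly"
    # (H//p, p, W//p, p, C) -> (H//p * W//p, p * p * C)
    x = img.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, patch * patch * C)

# Same token dimension at every scale; only the sequence length changes.
lo = patchify(np.zeros((64, 64, 4)))    # 16 tokens of dim 1024
hi = patchify(np.zeros((256, 256, 4)))  # 256 tokens of dim 1024
```

Under this framing, a sequence model with relative positional encoding (such as the rotary encoding the paper adopts) can, in principle, be applied to the longer sequences that arise at test-time scales never seen during training.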