Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding

📅 2026-04-13

📈 Citations: 0

✨ Influential: 0

career value

183K/year

🤖 AI Summary

This work addresses the challenge of efficient high-resolution region selection under limited observation budgets to enhance remote sensing understanding. It formulates cross-scale remote sensing interpretation as a unified cost-aware optimization problem and proposes a joint framework that integrates fine-grained importance sampling with cross-patch contextual modeling. To support this research, the authors introduce GL-10M, the first large-scale, multi-resolution remote sensing benchmark dataset with tens of millions of aligned image pairs. Experimental results demonstrate that the proposed method significantly outperforms existing approaches in both recognition and retrieval tasks, achieving superior performance at the same observational cost and effectively balancing accuracy and computational overhead.

Technology Category

Application Category

📝 Abstract

Remote sensing understanding inherently requires multi-resolution observation, since different targets and application tasks demand different levels of spatial detail. While low-resolution (LR) imagery enables efficient global observation, high-resolution (HR) imagery provides critical local details at much higher acquisition cost and limited coverage. This motivates a cross-scale sensing strategy that selectively acquires HR imagery from LR-based global perception to improve task performance under constrained cost. Existing methods for HR sampling methods typically make selection decisions from isolated LR patches, which ignore fine-grained intra-patch importance and cross-patch contextual interactions, leading to fragmented feature representation and suboptimal scene reasoning under sparse HR observations. To address this issue, we formulate cross-scale remote sensing understanding as a unified cost-aware problem that couples fine-grained HR sampling with cross-patch representation prediction, enabling more effective task reasoning with fewer HR observations. Furthermore, we present GL-10M, a large-scale benchmark of 10 million spatially aligned multi-resolution images, enabling systematic evaluation of budget-constrained cross-scale reasoning in remote sensing. Extensive experiments on recognition and retrieval tasks show that our method consistently achieves a superior performance-cost trade-off.

Problem

Research questions and friction points this paper is trying to address.

remote sensing understanding

cross-scale observation

cost-aware sampling

high-resolution imagery

multi-resolution

Innovation

Methods, ideas, or system contributions that make the work stand out.

cost-aware sensing

cross-scale observation

fine-grained sampling