🤖 AI Summary
The open-world remote sensing community lacks large-scale, multi-task benchmarks capable of comprehensively evaluating model generalization under semantic shift and covariate shift across heterogeneous domains.
Method: This paper introduces OpenEarthSensing, a large-scale, fine-grained open-world remote sensing benchmark comprising 189 scene and object categories and five heterogeneous domains (two RGB satellite, one RGB aerial, one multispectral RGB, and one infrared). It explicitly models both semantic shift and covariate shift. The benchmark provides a unified cross-domain, fine-grained, multi-task evaluation framework supporting core open-world tasks: semantic shift detection, covariate shift adaptation, and continual learning. Benchmark construction involves multi-source data acquisition, domain-specific shift modeling, standardized evaluation protocols, and integration of mainstream algorithm baselines.
Contribution/Results: OpenEarthSensing fills a gap in the field by enabling systematic, realistic assessment of model robustness and generalization under semantic and covariate shift. Baseline evaluations of mainstream open-world methods show that it is a challenging benchmark for open-world remote sensing.
📝 Abstract
In open-world remote sensing, deployed models must continuously adapt to a steady influx of new data, which often exhibits various shifts relative to the data seen during training. To handle this new data effectively, models must detect semantic shifts, adapt to covariate shifts, and continuously update themselves. These challenges give rise to a variety of open-world tasks. However, existing open-world remote sensing studies typically train and test within a single dataset to simulate open-world conditions, and large-scale benchmarks capable of evaluating multiple open-world tasks are currently lacking. In this paper, we introduce OpenEarthSensing, a large-scale fine-grained benchmark for open-world remote sensing. OpenEarthSensing includes 189 scene and object categories, covering the vast majority of potential semantic shifts that may occur in the real world. Additionally, OpenEarthSensing encompasses five data domains with significant covariate shifts: two RGB satellite domains, one RGB aerial domain, one multispectral (MS) RGB domain, and one infrared domain. These domains provide a more comprehensive testbed for evaluating the generalization performance of open-world models. We conduct baseline evaluations of current mainstream open-world tasks and methods on OpenEarthSensing, demonstrating that it serves as a challenging benchmark for open-world remote sensing.
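
As a rough illustration of the semantic shift detection task mentioned above, the sketch below scores each sample by its maximum softmax probability (MSP) and measures how well that score separates known-class samples from novel-class samples using AUROC. MSP and AUROC are a common baseline and metric for this kind of task; the abstract does not state which methods or metrics the benchmark uses, so they are assumptions here. The function names and the toy logits are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def msp_score(logits):
    """Maximum softmax probability: a higher score suggests a known class."""
    return softmax(logits).max(axis=1)


def auroc(known_scores, novel_scores):
    """Probability that a random known sample outscores a random novel one.

    Ties count as half, which matches the rank-based definition of AUROC.
    """
    k = np.asarray(known_scores, dtype=float)[:, None]
    n = np.asarray(novel_scores, dtype=float)[None, :]
    return float((k > n).mean() + 0.5 * (k == n).mean())


# Hypothetical logits from a 3-class classifier (illustrative values only).
known_logits = np.array([[6.0, 0.0, 0.0],   # confident predictions on
                         [0.0, 5.0, 1.0],   # samples from classes seen
                         [0.5, 0.0, 4.0]])  # during training
novel_logits = np.array([[1.0, 1.2, 0.9],   # near-uniform predictions on
                         [0.3, 0.2, 0.4]])  # samples from unseen classes

detection_auroc = auroc(msp_score(known_logits), msp_score(novel_logits))
```

An AUROC near 1.0 means the score separates known from novel samples well, while 0.5 corresponds to chance. Evaluating across domains would amount to repeating this computation with logits from each test domain, assuming the same classifier is applied to all of them.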