🤖 AI Summary
Remote sensing image out-of-distribution (OOD) detection faces challenges including severe label scarcity, complex multi-scale spatial structures, and significant distribution shifts, leading to poor generalization of existing methods. To address these issues, we propose a vision-language collaborative OOD detection framework tailored for remote sensing. First, we introduce a spatial feature enhancement module that explicitly models geospatial relationships. Second, we design a dual-path prompt alignment mechanism to achieve fine-grained matching between image regions and textual semantics. Third, we incorporate a confidence-guided self-training loop that jointly leverages self-supervised pseudo-label mining and multi-scale feature alignment. Evaluated on multiple remote sensing benchmarks, our method achieves substantial improvements over state-of-the-art approaches while using only minimal labeled data, demonstrating the effectiveness of joint spatial-semantic modeling in enhancing OOD detection robustness.
📝 Abstract
Out-of-distribution (OOD) detection represents a critical challenge in remote sensing applications, where reliable identification of novel or anomalous patterns is essential for autonomous monitoring, disaster response, and environmental assessment. Despite remarkable progress in OOD detection for natural images, existing methods and benchmarks remain poorly suited to remote sensing imagery due to data scarcity, complex multi-scale scene structures, and pronounced distribution shifts. To this end, we propose RS-OOD, a novel framework that leverages remote sensing-specific vision-language modeling to enable robust few-shot OOD detection. Our approach introduces three key innovations: spatial feature enhancement that improves scene discrimination, a dual-prompt alignment mechanism that cross-verifies scene context against fine-grained semantics for spatial-semantic consistency, and a confidence-guided self-training loop that dynamically mines pseudo-labels to expand training data without manual annotation. RS-OOD consistently outperforms existing methods across multiple remote sensing benchmarks and enables efficient adaptation with minimal labeled data, demonstrating the critical value of spatial-semantic integration.
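The confidence-guided pseudo-label mining step described in the abstract can be illustrated with a generic confidence-thresholding sketch. This is a minimal illustration of the general technique, not RS-OOD's actual implementation; the `mine_pseudo_labels` helper and the 0.9 threshold are hypothetical.

```python
import numpy as np

def mine_pseudo_labels(probs, threshold=0.9):
    """Select unlabeled samples whose top-class confidence exceeds the
    threshold and assign the argmax class as a pseudo-label.

    probs: (N, C) array of per-class probabilities for N unlabeled images.
    Returns (indices, pseudo_labels) for the confidently predicted samples.
    """
    confidence = probs.max(axis=1)          # top-class probability per sample
    keep = confidence >= threshold          # mask of confident predictions
    return np.nonzero(keep)[0], probs[keep].argmax(axis=1)

# Toy example: 4 unlabeled remote sensing scenes, 3 candidate classes.
probs = np.array([
    [0.95, 0.03, 0.02],   # confident -> mined as class 0
    [0.40, 0.35, 0.25],   # ambiguous -> skipped
    [0.05, 0.91, 0.04],   # confident -> mined as class 1
    [0.60, 0.30, 0.10],   # below threshold -> skipped
])
idx, labels = mine_pseudo_labels(probs, threshold=0.9)
print(idx, labels)  # -> [0 2] [0 1]
```

In a self-training loop, the mined `(idx, labels)` pairs would be folded back into the training set each round, expanding supervision without manual annotation.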