🤖 AI Summary
Remote sensing image out-of-distribution (OOD) detection faces challenges including severe label scarcity, complex multi-scale spatial structures, and significant distribution shifts, leading to poor generalization of existing methods. To address these issues, we propose a vision-language collaborative OOD detection framework tailored for remote sensing. First, we introduce a spatial feature enhancement module that explicitly models geospatial relationships. Second, we design a dual-path prompt alignment mechanism to achieve fine-grained matching between image regions and textual semantics. Third, we incorporate a confidence-guided self-training loop that jointly leverages self-supervised pseudo-label mining and multi-scale feature alignment. Evaluated on multiple remote sensing benchmarks, our method achieves substantial improvements over state-of-the-art approaches while using only minimal labeled data, demonstrating the effectiveness of joint spatial-semantic modeling in enhancing OOD detection robustness.
📝 Abstract
Out-of-distribution (OOD) detection represents a critical challenge in remote sensing applications, where reliable identification of novel or anomalous patterns is essential for autonomous monitoring, disaster response, and environmental assessment. Despite remarkable progress in OOD detection for natural images, existing methods and benchmarks remain poorly suited to remote sensing imagery due to data scarcity, complex multi-scale scene structures, and pronounced distribution shifts. To this end, we propose RS-OOD, a novel framework that leverages remote sensing-specific vision-language modeling to enable robust few-shot OOD detection. Our approach introduces three key innovations: spatial feature enhancement that improves scene discrimination, a dual-prompt alignment mechanism that cross-verifies scene context against fine-grained semantics for spatial-semantic consistency, and a confidence-guided self-training loop that dynamically mines pseudo-labels to expand training data without manual annotation. RS-OOD consistently outperforms existing methods across multiple remote sensing benchmarks and enables efficient adaptation with minimal labeled data, demonstrating the critical value of spatial-semantic integration.
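The confidence-guided pseudo-label mining step described in the abstract can be illustrated with a generic confidence-thresholding sketch. This is a minimal illustration of the general technique, not RS-OOD's actual implementation; the `mine_pseudo_labels` helper and the 0.9 threshold are hypothetical.

```python
import numpy as np

def mine_pseudo_labels(probs, threshold=0.9):
    """Select unlabeled samples whose top-class confidence exceeds the
    threshold and assign the argmax class as a pseudo-label.

    probs: (N, C) array of per-class probabilities for N unlabeled images.
    Returns (indices, pseudo_labels) for the confidently predicted samples.
    """
    confidence = probs.max(axis=1)          # top-class probability per sample
    keep = confidence >= threshold          # mask of confident predictions
    return np.nonzero(keep)[0], probs[keep].argmax(axis=1)

# Toy example: 4 unlabeled remote sensing scenes, 3 candidate classes.
probs = np.array([
    [0.95, 0.03, 0.02],   # confident -> mined as class 0
    [0.40, 0.35, 0.25],   # ambiguous -> skipped
    [0.05, 0.91, 0.04],   # confident -> mined as class 1
    [0.60, 0.30, 0.10],   # below threshold -> skipped
])
idx, labels = mine_pseudo_labels(probs, threshold=0.9)
print(idx, labels)  # -> [0 2] [0 1]
```

In a self-training loop, the mined `(idx, labels)` pairs would be folded back into the training set each round, expanding supervision without manual annotation.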