RS-OOD: A Vision-Language Augmented Framework for Out-of-Distribution Detection in Remote Sensing

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Remote sensing image out-of-distribution (OOD) detection faces challenges including severe label scarcity, complex multi-scale spatial structures, and significant distribution shifts, leading to poor generalization of existing methods. To address these issues, we propose a vision-language collaborative OOD detection framework tailored for remote sensing. First, we introduce a spatial feature enhancement module that explicitly models geospatial relationships. Second, we design a dual-path prompt alignment mechanism to achieve fine-grained matching between image regions and textual semantics. Third, we incorporate a confidence-guided self-training loop that jointly leverages self-supervised pseudo-label mining and multi-scale feature alignment. Evaluated on multiple remote sensing benchmarks, our method achieves substantial improvements over state-of-the-art approaches using only minimal labeled data, demonstrating the effectiveness of joint spatial-semantic modeling in enhancing OOD detection robustness.
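The paper does not publish code here, but the confidence-guided self-training loop it describes can be illustrated with a minimal sketch: select unlabeled samples whose predicted class probability exceeds a confidence threshold and turn them into pseudo-labels. The function name and the threshold value are hypothetical, not taken from the paper.

```python
import numpy as np

def mine_pseudo_labels(probs, threshold=0.9):
    """Select unlabeled samples whose max class probability exceeds a
    confidence threshold; return their indices and hard pseudo-labels.
    The 0.9 threshold is illustrative only."""
    probs = np.asarray(probs)
    confidence = probs.max(axis=1)   # highest class probability per sample
    labels = probs.argmax(axis=1)    # the class that probability belongs to
    keep = confidence >= threshold
    return np.flatnonzero(keep), labels[keep]

# Toy softmax outputs for four unlabeled images over three classes.
probs = [[0.95, 0.03, 0.02],
         [0.40, 0.35, 0.25],
         [0.05, 0.92, 0.03],
         [0.60, 0.30, 0.10]]
idx, labels = mine_pseudo_labels(probs)
print(idx.tolist(), labels.tolist())  # → [0, 2] [0, 1]
```

Only the first and third samples clear the threshold, so they alone would be added to the training set in the next self-training round; the paper's actual loop additionally couples this with multi-scale feature alignment.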

📝 Abstract
Out-of-distribution (OOD) detection represents a critical challenge in remote sensing applications, where reliable identification of novel or anomalous patterns is essential for autonomous monitoring, disaster response, and environmental assessment. Despite remarkable progress in OOD detection for natural images, existing methods and benchmarks remain poorly suited to remote sensing imagery due to data scarcity, complex multi-scale scene structures, and pronounced distribution shifts. To this end, we propose RS-OOD, a novel framework that leverages remote sensing-specific vision-language modeling to enable robust few-shot OOD detection. Our approach introduces three key innovations: spatial feature enhancement that improves scene discrimination, a dual-prompt alignment mechanism that cross-verifies scene context against fine-grained semantics for spatial-semantic consistency, and a confidence-guided self-training loop that dynamically mines pseudo-labels to expand training data without manual annotation. RS-OOD consistently outperforms existing methods across multiple remote sensing benchmarks and enables efficient adaptation with minimal labeled data, demonstrating the critical value of spatial-semantic integration.
Problem

Research questions and friction points this paper is trying to address.

Detecting novel patterns in remote sensing imagery
Addressing data scarcity and complex scene structures
Enabling robust few-shot out-of-distribution detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-language modeling for few-shot OOD detection
Dual-prompt alignment ensuring spatial-semantic consistency
Confidence-guided self-training with pseudo-label mining
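The dual-prompt alignment idea above can be sketched in CLIP-style terms: score an image embedding against two prompt sets, one scene-level and one fine-grained, and flag samples where the two views disagree or both score poorly. The aggregation rule (minimum of the two best cosine similarities) is an assumption for illustration, not the paper's exact formulation.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def dual_prompt_score(img_emb, scene_prompts, detail_prompts):
    """Require both prompt views to agree: take the best match in each
    set, then keep the weaker of the two. A low score marks the sample
    as a candidate OOD case. The min-aggregation is a hypothetical
    stand-in for the paper's cross-verification mechanism."""
    scene = max(cosine(img_emb, p) for p in scene_prompts)
    detail = max(cosine(img_emb, p) for p in detail_prompts)
    return min(scene, detail)

# Random embeddings stand in for CLIP image/text features.
rng = np.random.default_rng(0)
img = rng.normal(size=8)
scene_prompts = [rng.normal(size=8) for _ in range(3)]
detail_prompts = [rng.normal(size=8) for _ in range(3)]
score = dual_prompt_score(img, scene_prompts, detail_prompts)
print(-1.0 <= score <= 1.0)  # cosine similarity stays in [-1, 1]
```

In practice the score would be thresholded (or calibrated per class) to separate in-distribution scenes from OOD ones.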
Yingrui Ji
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100101, China
Jiansheng Chen
School of Computer and Communication Engineering, University of Science and Technology Beijing
Computer Vision, Machine Learning
Jingbo Chen
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100101, China
Anzhi Yue
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100101, China
Chenhao Wang
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100101, China
Kai Li
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100101, China
Yao Zhu
Zhejiang University, Hangzhou, 310058, China