🤖 AI Summary
This work addresses the challenges of salient object detection in optical remote sensing imagery, including complex backgrounds, low contrast, irregular object shapes, and multi-scale variations, by introducing deterministic rectified flow to this task for the first time. The authors propose ORSIFlow, a saliency-guided generative framework that operates in the compact latent space of a frozen variational autoencoder (VAE). A Salient Feature Discriminator provides global semantic discrimination, and a Salient Feature Calibrator refines object boundaries. Evaluated on multiple public remote sensing benchmarks, the method is reported to achieve state-of-the-art performance while generating saliency maps in only a few inference steps, improving both computational efficiency and detection accuracy.
📝 Abstract
Optical Remote Sensing Image Salient Object Detection (ORSI-SOD) remains challenging due to complex backgrounds, low contrast, irregular object shapes, and large variations in object scale. Existing discriminative methods directly regress saliency maps, while recent diffusion-based generative approaches suffer from stochastic sampling and high computational cost. In this paper, we propose ORSIFlow, a saliency-guided rectified flow framework that reformulates ORSI-SOD as a deterministic latent flow generation problem. ORSIFlow performs saliency mask generation in a compact latent space constructed by a frozen variational autoencoder, enabling efficient inference with only a few steps. To enhance saliency awareness, we design a Salient Feature Discriminator for global semantic discrimination and a Salient Feature Calibrator for precise boundary refinement. Extensive experiments on multiple public benchmarks show that ORSIFlow achieves state-of-the-art performance with significantly improved efficiency. Code is available at: https://github.com/Ch3nSir/ORSIFlow.
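The few-step, deterministic inference that rectified flow allows can be illustrated with a minimal sketch. This is not the ORSIFlow implementation: `euler_sample`, `make_toy_velocity`, the latent shape, and the step count are all hypothetical names and values chosen for illustration. The toy velocity field is given the target latent directly, whereas a real model would use a learned network conditioned on image features. The sketch only shows how a fixed-step Euler solver integrates a velocity field from noise at t = 0 to a mask latent at t = 1 without any stochastic sampling.

```python
import numpy as np


def euler_sample(velocity_fn, z0, num_steps=4):
    """Integrate dz/dt = v(z, t) from t=0 to t=1 with fixed-size Euler steps."""
    z = z0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) is never zero
        z = z + dt * velocity_fn(z, t)
    return z


def make_toy_velocity(target):
    """Hypothetical stand-in for a learned velocity network.

    On a straight path z_t = (1 - t) * z0 + t * target, the velocity is
    (target - z_t) / (1 - t). A trained model would predict this from the
    input image instead of being handed the target.
    """
    def velocity_fn(z, t):
        return (target - z) / (1.0 - t)
    return velocity_fn


rng = np.random.default_rng(0)
target_latent = rng.normal(size=(4, 8, 8))  # assumed latent shape, illustration only
noise = rng.normal(size=(4, 8, 8))          # starting point at t = 0

velocity_fn = make_toy_velocity(target_latent)
pred_latent = euler_sample(velocity_fn, noise, num_steps=4)

# Same noise and same velocity field give the same output: no sampling randomness.
pred_again = euler_sample(velocity_fn, noise, num_steps=4)
```

Because the toy path is exactly straight, the solver reaches the target latent up to floating-point error with any number of steps. A learned velocity field is only approximately straight, so the step count becomes a trade-off between speed and accuracy. In a latent-space pipeline the resulting latent would then be passed through the frozen VAE decoder to obtain the saliency mask.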