RDNet: Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network in Optical Remote Sensing Images

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses three challenges in salient object detection for optical remote sensing imagery: large variations in object scale, the difficulty CNN-based extractors have in modeling global context, and the high computational cost of self-attention. The authors propose RDNet, an architecture built on a Swin Transformer backbone with three modules: Dynamic Adaptive Detail-aware (DAD), which applies varied convolution kernels guided by object region proportions; Frequency-matching Context Enhancement (FCE), which enriches contextual information through wavelet interactions and attention; and Region Proportion-aware Localization (RPL), which uses cross-attention to highlight semantic details and includes a Proportion Guidance (PG) block that assists the DAD module. The authors report that this combination improves robustness to scale variation and localization accuracy, with superior detection performance compared with state-of-the-art methods.

📝 Abstract
Salient object detection (SOD) in remote sensing images faces significant challenges due to large variations in object sizes, the computational cost of self-attention mechanisms, and the limitations of CNN-based extractors in capturing global context and long-range dependencies. Existing methods that rely on fixed convolution kernels often struggle to adapt to diverse object scales, leading to detail loss or irrelevant feature aggregation. To address these issues, this work aims to enhance robustness to scale variations and achieve precise object localization. We propose the Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network (RDNet), which replaces the CNN backbone with the SwinTransformer for global context modeling and introduces three key modules: (1) the Dynamic Adaptive Detail-aware (DAD) module, which applies varied convolution kernels guided by object region proportions; (2) the Frequency-matching Context Enhancement (FCE) module, which enriches contextual information through wavelet interactions and attention; and (3) the Region Proportion-aware Localization (RPL) module, which employs cross-attention to highlight semantic details and integrates a Proportion Guidance (PG) block to assist the DAD module. By combining these modules, RDNet achieves robustness against scale variations and accurate localization, delivering superior detection performance compared with state-of-the-art methods.
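The idea of choosing convolution kernels from object region proportions can be illustrated with a minimal, dependency-free sketch. This is not the authors' DAD or PG implementation: the function names, the 0.5 saliency threshold, the proportion bounds (0.1, 0.4), the kernel sizes (3, 5, 7), the assumption that smaller regions get smaller kernels, and the use of a plain mean filter in place of a learned convolution are all hypothetical choices made only for illustration.

```python
# Hypothetical sketch: every name, threshold, and kernel size below is an
# illustrative assumption, not a value taken from the RDNet paper.

def region_proportion(saliency, threshold=0.5):
    """Return the fraction of pixels in a coarse saliency map above `threshold`."""
    total = sum(len(row) for row in saliency)
    salient = sum(1 for row in saliency for v in row if v > threshold)
    return salient / total if total else 0.0


def select_kernel_size(proportion, bounds=(0.1, 0.4), sizes=(3, 5, 7)):
    """Map a region proportion to a kernel size (assumed: small region, small kernel)."""
    for bound, size in zip(bounds, sizes):
        if proportion < bound:
            return size
    return sizes[-1]


def box_filter(feature, k):
    """Apply a k x k mean filter with edge clamping (stand-in for a learned conv)."""
    h, w = len(feature), len(feature[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y = min(max(i + di, 0), h - 1)  # clamp row index to the map
                    x = min(max(j + dj, 0), w - 1)  # clamp column index to the map
                    acc += feature[y][x]
            out[i][j] = acc / (k * k)
    return out


def proportion_guided_filter(feature, saliency):
    """Pick a kernel size from the salient-region proportion, then filter."""
    p = region_proportion(saliency)
    k = select_kernel_size(p)
    return box_filter(feature, k), k, p


if __name__ == "__main__":
    saliency = [[0.0] * 4 for _ in range(4)]
    saliency[1][2] = 1.0  # one salient pixel out of sixteen
    feature = [[2.0] * 4 for _ in range(4)]
    _, k, p = proportion_guided_filter(feature, saliency)
    print(k, p)
```

In a real network the kernel choice would more plausibly be a soft, learned weighting over parallel convolution branches than a hard threshold; the hard selection here only keeps the sketch short.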
Problem

Research questions and friction points this paper is trying to address.

salient object detection
remote sensing images
scale variation
global context
long-range dependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Adaptive Convolution
SwinTransformer
Wavelet-based Context Enhancement
Region Proportion-aware Localization
Salient Object Detection
Bin Wan
School of Control Science and Engineering, Shandong University, Jinan 250061, China; Key Laboratory of Machine Intelligence and System Control, Ministry of Education, Jinan 250061, China
Runmin Cong
School of Control Science and Engineering, Shandong University, Jinan 250061, China; Key Laboratory of Machine Intelligence and System Control, Ministry of Education, Jinan 250061, China
Xiaofei Zhou
Shanghai Jiao Tong University
Human-Computer Interaction, Educational Technology, AI Education, Augmented Reality, Learning
Hao Fang
University of Edinburgh, School of Engineering
Deep Learning, Medical Imaging, Inverse Problems, Electrical Impedance Tomography, Soft Robotics
Yaoqi Sun
School of Mathematics and Computer Science, Lishui University and Lishui Institute of Hangzhou Dianzi University, Hangzhou 310018, China
Sam Kwong
Lingnan University, Hong Kong
Video Coding, Evolutionary Computation, Machine Learning and Pattern Recognition