SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection

📅 2024-07-01
🏛️ IEEE Transactions on Pattern Analysis and Machine Intelligence
📈 Citations: 4
Influential: 0
🤖 AI Summary
Oriented object detection in aerial imagery suffers from high annotation costs, and existing semi-supervised methods are limited to axis-aligned bounding boxes. Method: This paper extends semi-supervised learning to oriented object detection with three components: (1) a Simple Instance-aware Dense Sampling (SIDS) strategy that generates comprehensive dense pseudo-labels; (2) a Geometry-aware Adaptive Weighting (GAW) loss that modulates the importance of each pseudo-label–prediction pair using the geometric properties of aerial objects; and (3) Noise-driven Global Consistency (NGC) regularization, which models the many-to-many set-level relationship between pseudo-labels and predictions as a global layout constraint. Contribution/Results: On DOTA-V1.5 and DOTA-V2.0, the method achieves state-of-the-art (SOTA) performance using only 10%–30% labeled data, outperforming the prior SOTA by 2.14–2.90 mAP. With the full labeled set, it improves a strong supervised baseline (70.66 mAP) by +1.82 mAP, reaching 72.48 mAP. The framework also generalizes to diverse oriented detectors and multi-view oriented 3D detectors.

📝 Abstract
Semi-supervised object detection (SSOD), which leverages unlabeled data to boost object detectors, has become a hot topic recently. However, existing SSOD approaches mainly focus on horizontal objects, leaving oriented objects, which are common in aerial images, unexplored. At the same time, the annotation cost of oriented objects is significantly higher than that of their horizontal counterparts (an approximate 36.5% increase in cost). Therefore, in this paper, we propose a simple yet effective Semi-supervised Oriented Object Detection method termed SOOD++. Specifically, we observe that objects in aerial images usually have arbitrary orientations, small scales, and dense distributions, which inspires the following core designs: a Simple Instance-aware Dense Sampling (SIDS) strategy generates comprehensive dense pseudo-labels; the Geometry-aware Adaptive Weighting (GAW) loss dynamically modulates the importance of each pseudo-label–prediction pair by leveraging the intricate geometric information of aerial objects; and we treat aerial images as global layouts, explicitly building the many-to-many relationship between the sets of pseudo-labels and predictions via the proposed Noise-driven Global Consistency (NGC). Extensive experiments on various oriented object datasets under various labeled settings demonstrate the effectiveness of our method. For example, on the DOTA-V2.0/DOTA-V1.5 benchmarks, the proposed method outperforms the previous state-of-the-art (SOTA) by a large margin (+2.90/2.14, +2.16/2.18, and +2.66/2.32 mAP) under the 10%, 20%, and 30% labeled-data settings, respectively, with single-scale training and testing. More importantly, it still improves upon a strong supervised baseline of 70.66 mAP, trained on the full DOTA-V1.5 train-val set, by +1.82 mAP, reaching 72.48 mAP and setting a new state-of-the-art.
Moreover, our method demonstrates stable generalization across different oriented detectors, even for multi-view oriented 3D object detectors. The code will be made available.
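To make the GAW idea concrete, here is a minimal sketch of geometry-aware adaptive weighting of an unsupervised box-regression loss. All names (`gaw_weight`, `unsupervised_loss`) and the weighting formula are illustrative assumptions, not the paper's implementation: the actual GAW loss combines richer geometric cues, while this sketch up-weights a pair only by its angular disagreement.

```python
import math

def smooth_l1(pred, target, beta=1.0):
    """Standard smooth-L1 on a scalar."""
    d = abs(pred - target)
    return 0.5 * d * d / beta if d < beta else d - 0.5 * beta

def gaw_weight(pseudo_angle, pred_angle, alpha=1.0):
    """Hypothetical geometry-aware weight: pairs whose orientations
    disagree more get a larger weight, so regression focuses on hard,
    rotation-sensitive pairs. Stand-in for the paper's GAW loss."""
    # Wrap the angular difference into [0, pi/2] for a rotated box.
    gap = abs(pseudo_angle - pred_angle) % math.pi
    gap = min(gap, math.pi - gap)
    return 1.0 + alpha * gap / (math.pi / 2)

def unsupervised_loss(pseudo_boxes, pred_boxes):
    """Weighted mean of per-pair box losses; boxes are
    (cx, cy, w, h, angle) tuples, already matched one-to-one."""
    total = 0.0
    for p, q in zip(pseudo_boxes, pred_boxes):
        w = gaw_weight(p[4], q[4])
        total += w * sum(smooth_l1(a, b) for a, b in zip(p, q))
    return total / max(len(pseudo_boxes), 1)

pseudo = [(10.0, 10.0, 4.0, 2.0, 0.0)]
loss_same = unsupervised_loss(pseudo, [(10.5, 10.0, 4.0, 2.0, 0.0)])
loss_rot = unsupervised_loss(pseudo, [(10.5, 10.0, 4.0, 2.0, 0.8)])
assert loss_rot > loss_same  # the misoriented pair is weighted up
```

The design point this illustrates: instead of treating every pseudo-label uniformly, the per-pair weight lets geometric disagreement (here, orientation) steer how much each pair contributes to the unsupervised loss.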
Problem

Research questions and friction points this paper is trying to address.

Boosting oriented object detection using unlabeled aerial image data
Addressing arbitrary orientations and dense distribution in aerial objects
Reducing high annotation costs for oriented objects in aerial imagery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simple Instance-aware Dense Sampling for pseudo-labels
Geometry-aware Adaptive Weighting loss for aerial objects
Noise-driven Global Consistency for many-to-many relationships
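These components plug into the usual teacher-student SSOD loop: an EMA teacher produces pseudo-labels on unlabeled images, which are filtered and used to supervise the student. The sketch below shows that scaffolding only; the function names are illustrative, and the plain score threshold is a stand-in for the paper's instance-aware dense sampling (SIDS), not its actual strategy.

```python
def ema_update(teacher, student, momentum=0.999):
    """Exponential-moving-average teacher update, the usual backbone of
    mean-teacher style SSOD frameworks (parameters kept as dicts here)."""
    return {k: momentum * teacher[k] + (1 - momentum) * student[k]
            for k in teacher}

def filter_pseudo_labels(detections, score_thr=0.5):
    """Keep only confident teacher detections as pseudo-labels.
    A simple score threshold standing in for SIDS."""
    return [d for d in detections if d["score"] >= score_thr]

# One illustrative step of the loop.
teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student)

dets = [{"score": 0.9, "box": (0.0, 0.0, 4.0, 2.0, 0.1)},
        {"score": 0.2, "box": (5.0, 5.0, 3.0, 1.0, 1.2)}]
kept = filter_pseudo_labels(dets)
assert len(kept) == 1  # only the confident detection survives
```

In a full pipeline, `kept` would feed the GAW-weighted unsupervised loss and the NGC set-level consistency term, while the EMA update keeps the teacher a slowly moving average of the student.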