GOOD: Towards Domain Generalized Orientated Object Detection

📅 2024-02-20
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This paper addresses two failures of oriented object detectors on unseen target domains, both caused by image style shifts: degraded content representation and unreliable orientation prediction. It formally introduces the task of domain-generalized oriented object detection. To improve cross-domain robustness, the authors propose GOOD, a framework with two cooperating components: (i) rotation-aware content consistency learning (RAC), which keeps the orientation representation stable across style-diversified samples, and (ii) style consistency learning (SEC), which stabilizes the generalization of content representation across image styles. The method additionally integrates CLIP-guided style hallucination, rotation-invariant feature modeling, and multi-domain style disentanglement training. Experiments on multiple cross-domain settings show state-of-the-art performance, with gains in both detection accuracy and orientation stability on unseen domains.

📝 Abstract
Oriented object detection has been rapidly developed in the past few years, but most of these methods assume the training and testing images are under the same statistical distribution, which is far from reality. In this paper, we propose the task of domain generalized oriented object detection, which intends to explore the generalization of oriented object detectors on arbitrary unseen target domains. Learning domain generalized oriented object detectors is particularly challenging, as the cross-domain style variation not only negatively impacts the content representation, but also leads to unreliable orientation predictions. To address these challenges, we propose a generalized oriented object detector (GOOD). After style hallucination by the emerging contrastive language-image pre-training (CLIP), it consists of two key components, namely, rotation-aware content consistency learning (RAC) and style consistency learning (SEC). The proposed RAC allows the oriented object detector to learn stable orientation representation from style-diversified samples. The proposed SEC further stabilizes the generalization ability of content representation from different image styles. Extensive experiments on multiple cross-domain settings show the state-of-the-art performance of GOOD. Source code will be publicly available.
Problem

Research questions and friction points this paper is trying to address.

Address generalization of oriented object detection across unseen domains.
Mitigate cross-domain style variation impact on content and orientation.
Propose a detector with rotation-aware and style consistency learning.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Style hallucination using CLIP for domain generalization
Rotation-aware content consistency learning (RAC)
Style consistency learning (SEC) for stable representation
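
The consistency idea behind these three points can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `perturb_style` randomly shifts per-channel feature statistics as a simple stand-in for CLIP-based style hallucination, `content_consistency` compares features after removing those statistics, and `orientation_consistency` compares predicted angles with a wrap-around period of pi. The function names, the feature layout (a list of channels, each a flat list of floats), and the loss forms are all hypothetical and do not reproduce the actual RAC or SEC objectives.

```python
import math
import random
from statistics import fmean, pstdev

EPS = 1e-6  # guards against division by zero for constant channels


def normalize(channel):
    """Remove the per-channel mean and scale (treated here as 'style')."""
    mu, sd = fmean(channel), pstdev(channel)
    return [(x - mu) / (sd + EPS) for x in channel]


def perturb_style(feat, rng, strength=0.5):
    """Re-style each channel with a shifted mean and a rescaled std.

    Assumption: a statistics perturbation standing in for style
    hallucination; it does not use CLIP.
    """
    out = []
    for channel in feat:
        mu, sd = fmean(channel), pstdev(channel)
        new_mu = mu + strength * rng.gauss(0.0, 1.0)
        new_sd = sd * math.exp(strength * rng.gauss(0.0, 1.0))
        out.append([x * new_sd + new_mu for x in normalize(channel)])
    return out


def content_consistency(feat_a, feat_b):
    """Mean squared distance between style-normalized features."""
    total, count = 0.0, 0
    for ch_a, ch_b in zip(feat_a, feat_b):
        for a, b in zip(normalize(ch_a), normalize(ch_b)):
            total += (a - b) ** 2
            count += 1
    return total / count


def orientation_consistency(thetas_a, thetas_b, period=math.pi):
    """Mean squared angle gap, wrapped into [-period/2, period/2)."""
    gaps = [((a - b + period / 2) % period) - period / 2
            for a, b in zip(thetas_a, thetas_b)]
    return fmean(g * g for g in gaps)


# Usage: a toy 4-channel feature map and its re-styled copy.
rng = random.Random(0)
feat = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(4)]
styled = perturb_style(feat, rng)
loss_content = content_consistency(feat, styled)
loss_angle = orientation_consistency([0.10, 1.20], [0.12, 1.15])
```

In this toy setup the content loss between a feature map and its re-styled copy is close to zero, because the perturbation only changes the statistics that `normalize` removes. A trained detector would instead have to learn that invariance, which is what consistency losses of this kind encourage.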
Authors

Qi Bi, Wuhan University, Wuhan, China
Beichen Zhou, Wuhan University, Wuhan, China
Jingjun Yi, Wuhan University, Wuhan, China
Wei Ji, Wuhan University, Wuhan, China
Haolan Zhan, Monash University (Natural Language Processing, Dialogue Systems, Responsible AI)
Gui-Song Xia, School of Artificial Intelligence, Wuhan University, China (Artificial Intelligence, Computer Vision, Photogrammetry, Remote Sensing, Robotics)