Rethinking Weak-to-Strong Augmentation in Source-Free Domain Adaptive Object Detection

📅 2024-10-07
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Strong data augmentation in Mean Teacher-based source-free domain adaptive object detection often erases category semantics, leading to noisy pseudo-labels and class confusion. This work identifies the semantic degradation that arises from weak-strong augmentation pairs. To address it, the work proposes Weak-to-Strong Contrastive Learning (WSCoL), which has three parts: a dual-branch architecture whose mapping network projects weak and strong features into a shared space; weak-feature-guided distillation, which passes the semantics preserved in the weak features to the strong branch; and prototype-guided cross-category contrastive learning, in which an uncertainty estimator adapts the background contrast to improve discriminability. Evaluated on multiple source-free object detection (SFOD) benchmarks, WSCoL achieves state-of-the-art performance. It gives the Mean Teacher paradigm a built-in mechanism for mitigating semantics loss, a limitation that prior augmentation-dependent approaches left unaddressed.

📝 Abstract
Source-Free domain adaptive Object Detection (SFOD) aims to transfer a detector pre-trained on a source domain to new, unlabelled target domains. Current SFOD methods typically follow the Mean Teacher framework, where weak-to-strong augmentation provides diverse and sharp contrast for self-supervised learning. However, this augmentation strategy suffers from an inherent problem that we call crucial semantics loss: because its disturbance is random and strong, strong augmentation is prone to losing typical visual components, which hinders cross-domain feature extraction. To address this previously ignored limitation, this paper introduces a novel Weak-to-Strong Contrastive Learning (WSCoL) approach. The core idea is to distill the semantics-lossless knowledge in the weak features (from the weak/teacher branch) to guide representation learning on the strong features (from the strong/student branch). To achieve this, we project the original features into a shared space using a mapping network, thereby reducing the bias between the weak and strong features. Meanwhile, weak-feature-guided contrastive learning is performed alternately in a weak-to-strong manner. Specifically, we first conduct adaptation-aware prototype-guided clustering on the weak features to generate pseudo labels for the corresponding strong features, which are matched through proposals. Subsequently, we identify positive and negative samples based on the pseudo labels and perform cross-category contrastive learning on the strong features, where an uncertainty estimator encourages adaptive background contrast. Extensive experiments demonstrate that WSCoL yields new state-of-the-art performance, offering a built-in mechanism that mitigates crucial semantics loss for the traditional Mean Teacher framework. The code and data will be released soon.
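
The two-step procedure described in the abstract (prototype-guided pseudo-labelling on weak features, then contrastive learning on the proposal-matched strong features) can be illustrated with a small sketch. This is not the authors' implementation, since the code has not been released. The function names, the cosine-similarity assignment, the InfoNCE-style loss form, and the temperature value are all assumptions made for illustration. The mapping network and the uncertainty estimator for background contrast are omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def assign_pseudo_labels(weak_feats, prototypes):
    """Step 1 (assumed form): label each weak feature with the index
    of its most similar class prototype."""
    return [
        max(range(len(prototypes)), key=lambda k: cosine(f, prototypes[k]))
        for f in weak_feats
    ]

def contrastive_loss(strong_feats, labels, temperature=0.1):
    """Step 2 (assumed form): supervised InfoNCE-style loss on strong
    features. Features sharing a pseudo label are positives; all other
    features are negatives. The temperature is a hypothetical default."""
    n = len(strong_feats)
    per_anchor = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {
            j: cosine(strong_feats[i], strong_feats[j]) / temperature
            for j in range(n) if j != i
        }
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        per_anchor.append(
            -sum(logits[j] - log_denom for j in positives) / len(positives)
        )
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0

# Toy data: index i in `weak` and `strong` stands for the same proposal.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
weak = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
strong = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]

labels = assign_pseudo_labels(weak, prototypes)
loss = contrastive_loss(strong, labels)
```

In this sketch the pseudo labels come only from the weak features, so the strong features are never asked to classify themselves. That mirrors the weak-to-strong direction of guidance the abstract describes.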
Problem

Research questions and friction points this paper is trying to address.

Strong augmentation erases class-relevant semantic components
Artificial inter-category confusion occurs in domain adaptation
Compensating lost semantics in strongly augmented detection images
Innovation

Methods, ideas, or system contributions that make the work stand out.

WSCoL compensates for semantics lost in strong augmentation
WSCoL uses weakly augmented features as semantic anchors
WSCoL is a plug-in for SFOD pipelines
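
For context on where such a component would sit, the Mean Teacher framework keeps a teacher model whose weights are an exponential moving average (EMA) of the student's weights. The teacher sees the weakly augmented view and the student sees the strongly augmented one. The sketch below shows only the EMA step, under stated assumptions: the parameter dictionaries are hypothetical stand-ins for real model weights, and the momentum value is illustrative rather than taken from the paper.

```python
def ema_update(teacher, student, momentum=0.99):
    """Return new teacher parameters as an exponential moving average:
    momentum * teacher + (1 - momentum) * student.
    Both arguments are plain dicts mapping parameter names to floats
    (hypothetical stand-ins for model weights)."""
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }

# One illustrative step: the teacher drifts slowly toward the student.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
teacher = ema_update(teacher, student, momentum=0.9)
```

A weak-to-strong contrastive term would be added to the student's training loss before this update, so the teacher update itself stays unchanged.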
Jiuzheng Yang
University of Shanghai for Science and Technology
Song Tang
University of Shanghai for Science and Technology, Universität Hamburg
Yangkuiyi Zhang
University of Shanghai for Science and Technology
Shuaifeng Li
School of Computer Science and Engineering, University of Electronic Science and Technology of China
domain adaptation, object detection
Mao Ye
University of Electronic Science and Technology of China
Jianwei Zhang
Universität Hamburg
Xiatian Zhu
University of Surrey
Machine Learning, Computer Vision