Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address low-quality pseudo-labels, severe class imbalance, and noisy annotations in semi-supervised object detection (SSOD), this paper proposes four building blocks that plug into an existing SSOD framework: Rare Class Collage (RCC), a data augmentation that builds collages of rare-class objects; Rare Class Focus (RCF), a stratified batch sampling strategy that gives all classes, including long-tail ones, a more balanced representation during training; Ground Truth Label Correction (GLC), which corrects false, missing, and noisy ground-truth labels using the consistency of teacher model predictions; and Pseudo-Label Selection (PLS), which removes low-quality pseudo-labeled images using an estimate of the missing detection rate (MDR) that accounts for class rarity. Experiments on autonomous driving datasets show an increase of up to 6% in SSOD performance, making SSOD more robust under class imbalance and label noise.

📝 Abstract
Semi-supervised object detection (SSOD) based on pseudo-labeling significantly reduces dependence on large labeled datasets by effectively leveraging both labeled and unlabeled data. However, real-world applications of SSOD often face critical challenges, including class imbalance, label noise, and labeling errors. We present an in-depth analysis of SSOD under real-world conditions, uncovering causes of suboptimal pseudo-labeling and key trade-offs between label quality and quantity. Based on our findings, we propose four building blocks that can be seamlessly integrated into an SSOD framework. Rare Class Collage (RCC): a data augmentation method that enhances the representation of rare classes by creating collages of rare objects. Rare Class Focus (RCF): a stratified batch sampling strategy that ensures a more balanced representation of all classes during training. Ground Truth Label Correction (GLC): a label refinement method that identifies and corrects false, missing, and noisy ground truth labels by leveraging the consistency of teacher model predictions. Pseudo-Label Selection (PLS): a selection method for removing low-quality pseudo-labeled images, guided by a novel metric estimating the missing detection rate while accounting for class rarity. We validate our methods through comprehensive experiments on autonomous driving datasets, resulting in an increase of up to 6% in SSOD performance. Overall, our investigation and novel, data-centric, and broadly applicable building blocks enable robust and effective SSOD in complex, real-world scenarios. Code is available at https://mos-ks.github.io/publications.
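As a rough illustration of the stratified batch sampling idea behind RCF, the sketch below fills each batch by cycling over classes, so rare classes are drawn as often as frequent ones. It is not the authors' implementation: the function name `stratified_batches`, the `image_classes` input format, and the shuffle-then-cycle rule are all assumptions made for this example.

```python
import random
from collections import defaultdict

def stratified_batches(image_classes, batch_size, num_batches, seed=0):
    """Yield batches of image ids, cycling over classes so rare ones are not starved.

    image_classes: dict mapping image id -> set of class names present in it
    (a hypothetical input format chosen for this sketch).
    """
    rng = random.Random(seed)
    # Invert the mapping: class name -> list of images containing that class.
    by_class = defaultdict(list)
    for image_id, classes in image_classes.items():
        for cls in classes:
            by_class[cls].append(image_id)
    class_names = sorted(by_class)

    for _ in range(num_batches):
        # Shuffle the class order per batch so no class is favored
        # when batch_size is smaller than the number of classes.
        order = class_names[:]
        rng.shuffle(order)
        # Each slot picks a class in turn, then an image uniformly within it.
        yield [rng.choice(by_class[order[slot % len(order)]])
               for slot in range(batch_size)]
```

With three classes and `batch_size=3`, every batch contains one draw per class, so an image holding the only instance of a rare class appears in every batch. Sampling with replacement is a deliberate simplification here; a real sampler would also need to handle epoch boundaries and duplicate images within a batch.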
Problem

Research questions and friction points this paper is trying to address.

Addresses class imbalance in semi-supervised object detection
Reduces label noise and errors in real-world SSOD applications
Balances the trade-off between pseudo-label quality and quantity
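The quality-quantity trade-off can be made concrete with a toy confidence-threshold sweep: raising the threshold keeps fewer pseudo-labels, but a larger share of the kept ones is correct. This is a generic illustration, not the paper's PLS metric; the scores and correctness flags are invented, and `sweep_thresholds` is a hypothetical helper.

```python
def sweep_thresholds(detections, thresholds):
    """Return (threshold, kept_count, precision) for each confidence threshold.

    detections: list of (confidence, is_correct) pairs. Correctness is unknown
    for unlabeled data, so this only applies to a labeled validation set.
    """
    rows = []
    for t in thresholds:
        kept = [ok for conf, ok in detections if conf >= t]
        precision = sum(kept) / len(kept) if kept else float("nan")
        rows.append((t, len(kept), precision))
    return rows

# Toy detections (made-up numbers for illustration only).
toy = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
       (0.60, True), (0.50, False), (0.40, False), (0.30, False)]

for t, kept, precision in sweep_thresholds(toy, [0.3, 0.5, 0.7, 0.9]):
    print(f"threshold={t:.1f} kept={kept} precision={precision:.2f}")
```

On this toy data the kept count falls from 8 to 2 as the threshold rises from 0.3 to 0.9, while precision rises from 0.50 to 1.00.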
Innovation

Methods, ideas, or system contributions that make the work stand out.

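Rare Class Collage (RCC) enhances rare-class representation by building collages of rare objects
Rare Class Focus (RCF) uses stratified batch sampling to balance class representation
Ground Truth Label Correction (GLC) corrects false, missing, and noisy ground-truth labels
Pseudo-Label Selection (PLS) removes low-quality pseudo-labeled images using an estimated missing detection rate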
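To give a feel for the collage idea, the sketch below tiles cropped rare-class objects onto a blank canvas and records a bounding box for each. It is a minimal sketch, not the paper's RCC procedure: the grid layout, the fixed cell size, and the name `make_collage` are assumptions, and it requires NumPy.

```python
import numpy as np

def make_collage(crops, labels, cell=64, cols=2):
    """Tile object crops onto a blank canvas and return the image plus boxes.

    crops: list of HxWx3 uint8 arrays, each no larger than cell x cell
    (a simplifying assumption of this sketch).
    Returns (canvas, boxes) where each box is (x1, y1, x2, y2, label).
    """
    rows = -(-len(crops) // cols)  # ceiling division
    canvas = np.zeros((rows * cell, cols * cell, 3), dtype=np.uint8)
    boxes = []
    for i, (crop, label) in enumerate(zip(crops, labels)):
        h, w = crop.shape[:2]
        if h > cell or w > cell:
            raise ValueError("crop does not fit in a grid cell")
        # Top-left corner of the i-th grid cell.
        y0, x0 = (i // cols) * cell, (i % cols) * cell
        canvas[y0:y0 + h, x0:x0 + w] = crop
        boxes.append((x0, y0, x0 + w, y0 + h, label))
    return canvas, boxes
```

The returned boxes can serve as labels for the synthetic image. Choices such as background content, object scaling, and overlap are left out of this sketch.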