🤖 AI Summary
This study addresses the challenge of effectively leveraging large-scale unlabeled images to improve object detection performance under limited annotation budgets. We systematically evaluate three prominent semi-supervised object detection methods—MixPL, Semi-DETR, and Consistent-Teacher—across MS-COCO, Pascal VOC, and a custom Beetle dataset, analyzing their trade-offs among accuracy, model size, and inference latency under varying labeling ratios. We reveal consistent patterns of performance degradation as labeled data decreases, on both general-purpose and domain-specific datasets. Our empirical findings provide actionable insights and practical guidance for selecting appropriate semi-supervised approaches in resource-constrained scenarios.
📝 Abstract
Learning in data-scarce settings has recently gained significant attention in the research community. Semi-supervised object detection (SSOD) aims to improve detection performance by leveraging a large number of unlabeled images alongside a limited number of labeled images. In this paper, we present a comprehensive comparison of three state-of-the-art SSOD approaches, namely MixPL, Semi-DETR, and Consistent-Teacher, with the goal of understanding how performance varies with the number of labeled images. We conduct experiments on MS-COCO and Pascal VOC, two popular object detection benchmarks that allow for standardized evaluation. In addition, we evaluate the SSOD approaches on a custom Beetle dataset, which gives us insight into their performance on specialized datasets with a smaller number of object categories. Our findings highlight the trade-offs between accuracy, model size, and latency, providing guidance on which methods are best suited for low-data regimes.