Online Data Curation for Object Detection via Marginal Contributions to Dataset-level Average Precision

📅 2025-11-18
🤖 AI Summary
Existing online data selection methods rarely carry over to object detection: its structural complexity and domain gaps make it hard to assess the value of individual samples. This paper proposes DetGain, described as the first framework to introduce marginal contribution modeling into online data selection for object detection. DetGain estimates each image's marginal gain in dataset-level Average Precision (AP) by quantifying how the image's predictions perturb the global AP, and it uses the gap between teacher and student contributions to choose samples at each iteration. It requires no changes to the detector architecture and relies only on prediction quality evaluation, global score distribution modeling, and teacher-student gap analysis, which keeps it minimally intrusive and broadly applicable. Experiments on COCO with multiple detectors show that DetGain accelerates convergence and improves final accuracy, stays robust to noisy or low-quality data, and combines with knowledge distillation for further gains.

📝 Abstract
High-quality data has become a primary driver of progress under scaling laws, with curated datasets often outperforming much larger unfiltered ones at lower cost. Online data curation extends this idea by dynamically selecting training samples based on the model's evolving state. While effective in classification and multimodal learning, existing online sampling strategies rarely extend to object detection because of its structural complexity and domain gaps. We introduce DetGain, an online data curation method designed for object detection that estimates the marginal perturbation of each image to dataset-level Average Precision (AP) based on its prediction quality. By modeling global score distributions, DetGain efficiently estimates the global AP change and computes teacher-student contribution gaps to select informative samples at each iteration. The method is architecture-agnostic and minimally intrusive, enabling straightforward integration into diverse object detection architectures. Experiments on the COCO dataset with multiple representative detectors show consistent improvements in accuracy. DetGain also demonstrates strong robustness under low-quality data and can be effectively combined with knowledge distillation techniques to further enhance performance, highlighting its potential as a general and complementary strategy for data-efficient object detection.
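To make "marginal perturbation to dataset-level AP" concrete, here is a minimal Python sketch. It is not the paper's estimator: the abstract says DetGain models global score distributions to estimate the AP change efficiently, whereas this sketch simply recomputes AP with and without one image's detections. The function names, the `(score, is_true_positive)` representation, and the use of non-interpolated single-class AP are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed representation: one detection = (confidence score, matched a ground-truth box?)
Detection = Tuple[float, bool]


def average_precision(dets: List[Detection], num_gt: int) -> float:
    """Non-interpolated AP: sum of precision at each true-positive rank, divided by num_gt."""
    if num_gt == 0:
        return 0.0
    ranked = sorted(dets, key=lambda d: d[0], reverse=True)
    tp_seen = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            tp_seen += 1
            precision_sum += tp_seen / rank
    return precision_sum / num_gt


def marginal_ap_change(
    pool: List[Detection],
    pool_gt: int,
    image_dets: List[Detection],
    image_gt: int,
) -> float:
    """Brute-force AP change when one image's detections join the global pool.

    Hypothetical helper: a stand-in for the paper's distribution-based estimate.
    """
    before = average_precision(pool, pool_gt)
    after = average_precision(pool + image_dets, pool_gt + image_gt)
    return after - before


# A small global pool with two ground-truth objects.
pool = [(0.9, True), (0.8, False), (0.7, True)]

# An image whose single object is detected confidently raises the pooled AP.
good = marginal_ap_change(pool, 2, [(0.95, True)], 1)

# An image with a confident false positive and a missed object lowers it.
bad = marginal_ap_change(pool, 2, [(0.95, False)], 1)
```

In this toy pool the well-predicted image has a positive contribution and the poorly predicted one a negative contribution, which is the kind of per-image signal the abstract describes.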
Problem

Research questions and friction points this paper is trying to address.

Develops an online data curation method for object detection training
Estimates each image's contribution to the dataset-level Average Precision metric
Selects informative samples dynamically during model training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Online curation method for object detection
Estimates marginal perturbation to dataset-level AP
Models global score distributions for sample selection
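
The selection step can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract says samples are selected from teacher-student contribution gaps but does not give the sign convention or the selection rule, so the gap definition (teacher minus student), the top-k rule, and the name `select_by_gap` are hypothetical.

```python
from typing import List


def select_by_gap(
    student_contrib: List[float],
    teacher_contrib: List[float],
    k: int,
) -> List[int]:
    """Return indices of the k images with the largest teacher-minus-student gap.

    Assumption: a large gap marks an image the teacher handles well and the
    student does not yet, so it is treated as informative.
    """
    if len(student_contrib) != len(teacher_contrib):
        raise ValueError("contribution lists must have the same length")
    gaps = [t - s for s, t in zip(student_contrib, teacher_contrib)]
    # Stable sort: ties keep their original batch order.
    order = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    return order[:k]


# Per-image AP contributions for a candidate batch of four images (made-up values).
student = [0.01, -0.05, 0.02, 0.00]
teacher = [0.02, 0.03, 0.02, 0.04]

selected = select_by_gap(student, teacher, k=2)
```

Here the second and fourth images have the largest gaps, so they would be kept for the training step under this assumed rule.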
Zitang Sun
Sony Group Corporation
Masakazu Yoshimura
Sony Group Corporation
Computer Vision and Pattern Recognition, Artificial Intelligence
Junji Otsuka
Sony Group Corporation
Atsushi Irie
Sony Group Corporation
Takeshi Ohashi
Sony Group Corporation