Online Data Curation for Object Detection via Marginal Contributions to Dataset-level Average Precision

📅 2025-11-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing online data selection methods rarely extend to object detection, whose structural complexity and domain gaps make it hard to assess the value of individual samples. This paper proposes DetGain, described as the first framework to bring marginal contribution modeling to online data selection for object detection. DetGain estimates each image's marginal effect on dataset-level Average Precision (AP) from its prediction quality, and uses the gap between teacher and student contributions to choose samples at each iteration. It requires no changes to the detector architecture, relying only on prediction quality evaluation, global score distribution modeling, and teacher-student gap analysis, which keeps it minimally intrusive and broadly applicable. Experiments with multiple detectors on COCO show that DetGain accelerates convergence and improves final accuracy, remains robust to noisy or low-quality data, and combines with knowledge distillation for further gains.

📝 Abstract

High-quality data has become a primary driver of progress under scaling laws, with curated datasets often outperforming much larger unfiltered ones at lower cost. Online data curation extends this idea by dynamically selecting training samples based on the model's evolving state. While effective in classification and multimodal learning, existing online sampling strategies rarely extend to object detection because of its structural complexity and domain gaps. We introduce DetGain, an online data curation method specifically for object detection that estimates the marginal perturbation of each image to dataset-level Average Precision (AP) based on its prediction quality. By modeling global score distributions, DetGain efficiently estimates the global AP change and computes teacher-student contribution gaps to select informative samples at each iteration. The method is architecture-agnostic and minimally intrusive, enabling straightforward integration into diverse object detection architectures. Experiments on the COCO dataset with multiple representative detectors show consistent improvements in accuracy. DetGain also demonstrates strong robustness under low-quality data and can be effectively combined with knowledge distillation techniques to further enhance performance, highlighting its potential as a general and complementary strategy for data-efficient object detection.
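To make the idea of a marginal perturbation to dataset-level AP concrete, here is a minimal brute-force sketch. It is not the paper's estimator: it recomputes a single-class, non-interpolated AP over a pooled list of detections, whereas the abstract describes an efficient estimate based on global score distributions. The names `average_precision`, `marginal_ap`, and `select_by_gap` are hypothetical, and the direction of the teacher-student gap (teacher contribution minus student contribution, largest first) is an assumption.

```python
from typing import List, Tuple

# A detection is (confidence score, matched a ground-truth box?).
Detection = Tuple[float, bool]


def average_precision(dets: List[Detection], num_gt: int) -> float:
    """Non-interpolated AP over a pool of detections for one class."""
    if num_gt == 0:
        return 0.0
    ranked = sorted(dets, key=lambda d: d[0], reverse=True)
    true_positives = 0
    ap = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            ap += true_positives / rank  # precision at this recall step
    return ap / num_gt


def marginal_ap(pool: List[Detection], pool_gt: int,
                image_dets: List[Detection], image_gt: int) -> float:
    """AP change when one image's detections and ground truths join the pool."""
    before = average_precision(pool, pool_gt)
    after = average_precision(pool + image_dets, pool_gt + image_gt)
    return after - before


def select_by_gap(pool, pool_gt, candidates, k):
    """Pick the k images with the largest teacher-minus-student AP gap.

    candidates: list of (image_id, student_dets, teacher_dets, num_gt).
    The sign convention is an assumption, not taken from the paper.
    """
    gaps = []
    for image_id, student, teacher, num_gt in candidates:
        gap = (marginal_ap(pool, pool_gt, teacher, num_gt)
               - marginal_ap(pool, pool_gt, student, num_gt))
        gaps.append((gap, image_id))
    gaps.sort(reverse=True)
    return [image_id for _, image_id in gaps[:k]]
```

Under these assumptions, an image on which the student places a confident false positive where the teacher has a true positive receives a large gap and is selected first, while an image both models handle equally well receives a gap of zero.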

Problem

Research questions and friction points this paper is trying to address.

Develops an online data curation method for object detection training

Estimates each image's contribution to the dataset-level Average Precision metric

Selects informative samples dynamically during model training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Online curation method for object detection

Estimates marginal perturbation to dataset-level AP

Models global score distributions for sample selection
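One way to read "models global score distributions" is to summarize all detections seen so far as binned counts of true-positive and false-positive scores, so that the AP change from one more image costs a pass over the bins rather than a re-sort of every detection. The sketch below illustrates that reading only; the class `ScoreHistogram`, its bin count, and the binned AP approximation are assumptions, not the paper's formulation. Scores are assumed to lie in [0, 1].

```python
class ScoreHistogram:
    """Binned summary of detection scores, split by true/false positive."""

    def __init__(self, num_bins=100):
        self.num_bins = num_bins
        self.tp = [0.0] * num_bins
        self.fp = [0.0] * num_bins
        self.num_gt = 0

    def _bin(self, score):
        # Clamp so that scores of exactly 0.0 or 1.0 land in a valid bin.
        return max(0, min(int(score * self.num_bins), self.num_bins - 1))

    def add(self, dets, num_gt):
        """Fold one image's (score, is_tp) detections into the summary."""
        for score, is_tp in dets:
            (self.tp if is_tp else self.fp)[self._bin(score)] += 1
        self.num_gt += num_gt

    def ap(self, extra=(), extra_gt=0):
        """Approximate AP, optionally with extra detections added virtually."""
        tp, fp = list(self.tp), list(self.fp)
        for score, is_tp in extra:
            (tp if is_tp else fp)[self._bin(score)] += 1
        num_gt = self.num_gt + extra_gt
        if num_gt == 0:
            return 0.0
        cum_tp = cum_fp = 0.0
        ap = 0.0
        # Sweep from the highest-score bin down, accumulating
        # precision times the recall gained in each bin.
        for b in range(self.num_bins - 1, -1, -1):
            cum_tp += tp[b]
            cum_fp += fp[b]
            if tp[b] > 0:
                ap += (cum_tp / (cum_tp + cum_fp)) * (tp[b] / num_gt)
        return ap

    def marginal_ap(self, dets, num_gt):
        """Estimated AP change if this image were added; does not mutate."""
        return self.ap(extra=dets, extra_gt=num_gt) - self.ap()
```

The approximation treats all detections within a bin as tied, so finer bins track the exact ranking more closely at a higher per-query cost.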
