🤖 AI Summary
Existing active learning methods for object detection struggle to align sample-informativeness estimates with task-specific performance metrics such as mAP, resulting in suboptimal annotation efficiency. To address this, we propose the first reinforcement learning (RL)-based active learning framework explicitly optimized for mAP improvement. Our approach models the expected change in model output as an informativeness measure and optimizes batch-wise sample selection end-to-end. We design an LSTM-based RL agent trained via policy gradients, coupled with a fast look-up-table-based mAP approximation for efficient reward estimation. Extensive experiments on PASCAL VOC and MS COCO demonstrate significant improvements over state-of-the-art methods across diverse backbone architectures. The framework achieves an excellent trade-off between accuracy gain and computational overhead, enabling scalable, practical deployment in real-world object detection scenarios.
📝 Abstract
Active learning strategies aim to train high-performance models with minimal labeled data by selecting the most informative instances for labeling. However, existing methods for assessing data informativeness often fail to align directly with task-model performance metrics, such as mean average precision (mAP) in object detection. This paper introduces Mean-AP Guided Reinforced Active Learning for Object Detection (MGRAL), a novel approach that leverages expected model output change as an informativeness measure for deep detection networks, directly optimizing the sampling strategy with respect to mAP. MGRAL employs an LSTM-based reinforcement learning agent to navigate both the combinatorial challenge of batch sample selection and the non-differentiable relationship between detection performance and the selected batch. The agent optimizes its selection policy via policy gradients, with mAP improvement as the reward signal. To address the computational cost of estimating mAP on unlabeled samples, we implement fast look-up tables, ensuring real-world feasibility. We evaluate MGRAL on the PASCAL VOC and MS COCO benchmarks across various backbone architectures. Our approach demonstrates strong performance, establishing a new paradigm in reinforcement learning-based active learning for object detection.
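The core loop described above (an agent samples a batch, receives an estimated mAP improvement as reward, and is updated by policy gradient) can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the LSTM agent is replaced by a simple linear scoring policy, and the look-up-table mAP estimate is replaced by a toy surrogate reward (`surrogate_map_gain`); all names, sizes, and the reward function are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pool: N unlabeled samples, each with a d-dim feature vector.
N, d, k = 40, 5, 5
features = rng.normal(size=(N, d))

def surrogate_map_gain(selected):
    # Hypothetical stand-in for the paper's fast look-up-table estimate of
    # mAP improvement: pretend samples with a larger first feature are
    # more informative.
    return features[selected, 0].mean()

w = np.zeros(d)          # parameters of the (simplified) selection policy
lr, episodes = 0.1, 300
baseline = 0.0           # running-mean baseline to reduce gradient variance

for ep in range(episodes):
    # Softmax scores over the pool define the sampling distribution.
    logits = features @ w
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Draw a batch of k distinct samples.
    selected = rng.choice(N, size=k, replace=False, p=probs)
    reward = surrogate_map_gain(selected)
    baseline = 0.9 * baseline + 0.1 * reward
    # REINFORCE update: grad log-prob of each chosen sample under the
    # softmax policy (independent-draw approximation), scaled by advantage.
    grad = np.zeros(d)
    for i in selected:
        grad += features[i] - probs @ features
    w += lr * (reward - baseline) * grad

# After training, the policy's top-k picks should beat a random batch
# on the surrogate gain.
trained_gain = surrogate_map_gain(np.argsort(features @ w)[-k:])
random_gain = features[:, 0].mean()
```

In the actual method, the reward is the estimated mAP change of the detector after adding the batch, and the agent conditions on previously chosen samples via its LSTM state, which this linear sketch omits.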