🤖 AI Summary
In continual object detection, replay-based methods face the missing annotations problem: images stored from earlier tasks may contain unlabeled instances of classes that only become known in later tasks, leading to task interference. Because of this, most prior work relies on distillation-based techniques, which are effective only when there is substantial class overlap between tasks. We propose Replay Consolidation with Label Propagation for Object Detection (RCLPOD), an alternative to distillation that enhances the replay memory on two fronts: a selection strategy that promotes class balance among stored samples, and a label propagation technique that enriches their ground-truth annotations with instances of newly learned classes. RCLPOD outperforms existing techniques on the PASCAL VOC and COCO benchmarks and is designed to work with modern architectures such as YOLOv8, making it suitable for resource-constrained, real-world applications like autonomous driving and robotics.
📝 Abstract
Continual Learning (CL) aims to learn new data while remembering previously acquired knowledge. In contrast to CL for image classification, CL for Object Detection faces additional challenges such as the missing annotations problem. In this scenario, images from previous tasks may contain instances of unknown classes that could reappear as labeled in future tasks, leading to task interference in replay-based approaches. Consequently, most approaches in the literature have focused on distillation-based techniques, which are effective when there is a significant class overlap between tasks. In our work, we propose an alternative to distillation-based approaches with a novel approach called Replay Consolidation with Label Propagation for Object Detection (RCLPOD). RCLPOD enhances the replay memory by improving the quality of the stored samples through a technique that promotes class balance, while also improving the quality of the ground truth associated with these samples through a technique called label propagation. RCLPOD outperforms existing techniques on well-established benchmarks such as VOC and COCO. Moreover, our approach is developed to work with modern architectures like YOLOv8, making it suitable for dynamic, real-world applications such as autonomous driving and robotics, where continuous learning and resource efficiency are essential.
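The two replay-memory ideas described above can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's actual implementation: samples are simplified to dicts of class labels, and the function names (`class_balanced_selection`, `propagate_labels`) and the greedy selection heuristic are hypothetical stand-ins for the techniques the abstract names.

```python
from collections import Counter

def class_balanced_selection(candidates, memory_size):
    """Greedily fill the replay memory with samples whose classes
    are currently under-represented, promoting class balance."""
    memory, counts = [], Counter()
    pool = list(candidates)
    while pool and len(memory) < memory_size:
        # Score each candidate by how common its classes already are
        # in memory; the rarest-class sample wins.
        best = min(pool, key=lambda s: sum(counts[c] for c in s["labels"]))
        pool.remove(best)
        memory.append(best)
        counts.update(best["labels"])
    return memory

def propagate_labels(sample, detections, known_classes, score_thr=0.5):
    """Label propagation: enrich a stored sample's ground truth with
    confident detections of classes that have since become known."""
    extra = [d["label"] for d in detections
             if d["label"] in known_classes
             and d["label"] not in sample["labels"]
             and d["score"] >= score_thr]
    sample["labels"] = sample["labels"] + extra
    return sample
```

In practice the detections would come from the current detector (e.g. a YOLOv8 model) run on the stored images, so that old replay samples gain annotations for classes introduced by later tasks instead of interfering with them.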