YOLO-IOD: Towards Real Time Incremental Object Detection

📅 2025-12-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address catastrophic forgetting in YOLO-based incremental object detection—caused by foreground-background confusion, parameter interference, and misaligned knowledge distillation—this paper proposes the first efficient incremental learning method tailored for real-time YOLO frameworks. Building upon YOLO-World, we introduce three core mechanisms: conflict-aware pseudo-label optimization, importance-driven convolutional kernel selection, and cross-stage asymmetric dual-teacher distillation. We further construct LoCo COCO, a new benchmark rigorously designed to prevent inter-task data leakage. Our approach enables parameter-efficient fine-tuning and achieves state-of-the-art performance on both standard and LoCo COCO benchmarks—delivering high accuracy, low forgetting rates, and YOLO-level real-time inference speed (≥30 FPS). This work marks the first successful unification of high performance and strong practicality in incremental YOLO detection.
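The first mechanism, conflict-aware pseudo-label optimization, can be illustrated with a small sketch. The three-way confidence split and the threshold values below are illustrative assumptions, not the paper's exact rule: high-confidence old-class predictions from the previous-stage teacher become pseudo ground truth, low-confidence ones are treated as background, and mid-confidence regions are marked "ignore" so the new-task loss does not push them toward background.

```python
# Hypothetical sketch of conflict-aware pseudo-label refinement.
# keep_thresh / ignore_thresh are assumed values for illustration.

def refine_pseudo_labels(preds, keep_thresh=0.6, ignore_thresh=0.3):
    """Split old-teacher predictions into (pseudo_gt, ignore_regions).

    preds: list of (class_id, confidence, box) tuples produced by the
    previous-stage detector on current-task images.
    """
    pseudo_gt, ignore = [], []
    for cls, conf, box in preds:
        if conf >= keep_thresh:
            pseudo_gt.append((cls, conf, box))  # trusted old-class object
        elif conf >= ignore_thresh:
            ignore.append(box)  # ambiguous: excluded from the background loss
        # else: low confidence, treated as plain background
    return pseudo_gt, ignore
```

Keeping the ambiguous middle band out of the background loss is what prevents foreground-background confusion: the student is never explicitly told that a possible old-class or future-class object is background.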

📝 Abstract
Current methods for incremental object detection (IOD) primarily rely on Faster R-CNN or DETR series detectors; however, these approaches do not accommodate real-time YOLO detection frameworks. In this paper, we first identify three primary types of knowledge conflicts that contribute to catastrophic forgetting in YOLO-based incremental detectors: foreground-background confusion, parameter interference, and misaligned knowledge distillation. Subsequently, we introduce YOLO-IOD, a real-time IOD framework built upon the pretrained YOLO-World model, which facilitates incremental learning via stage-wise parameter-efficient fine-tuning. Specifically, YOLO-IOD encompasses three principal components: 1) Conflict-Aware Pseudo-Label Refinement (CPR), which mitigates foreground-background confusion by leveraging the confidence levels of pseudo-labels and identifying potential objects relevant to future tasks; 2) Importance-based Kernel Selection (IKS), which identifies and updates the pivotal convolution kernels pertinent to the current task during the current learning stage; 3) Cross-Stage Asymmetric Knowledge Distillation (CAKD), which addresses the misaligned distillation conflict by passing the student detector's features through the detection heads of both the previous and current teacher detectors, thereby enabling asymmetric distillation between existing and newly introduced categories. We further introduce LoCo COCO, a more realistic benchmark that eliminates data leakage across stages. Experiments on both conventional and LoCo COCO benchmarks show that YOLO-IOD achieves superior performance with minimal forgetting.
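The asymmetric distillation in CAKD can be sketched as follows. The linear detection heads, the MSE distillation loss, and the exact logit split are simplifying assumptions for illustration (real YOLO heads are convolutional and the paper's loss may differ); the key idea shown is that the student's features pass through both teachers' heads, with the previous-stage teacher supervising only old-class logits and the current-stage teacher only new-class logits, so the two supervision signals never conflict.

```python
import numpy as np

# Illustrative sketch of cross-stage asymmetric knowledge distillation,
# assuming linear heads (weight matrices) and an MSE distillation loss.

def cakd_loss(student_feat, old_head, new_head,
              old_logits_t, new_logits_t, n_old):
    """student_feat: (d,) student feature vector.
    old_head / new_head: (num_cls, d) previous / current teacher heads.
    old_logits_t / new_logits_t: corresponding teacher logits.
    n_old: number of classes from earlier stages."""
    logits_old = old_head @ student_feat  # student features, old teacher head
    logits_new = new_head @ student_feat  # student features, new teacher head
    # Asymmetric split: old classes supervised by the old teacher only,
    # new classes supervised by the current teacher only.
    loss_old = np.mean((logits_old[:n_old] - old_logits_t[:n_old]) ** 2)
    loss_new = np.mean((logits_new[n_old:] - new_logits_t[n_old:]) ** 2)
    return loss_old + loss_new
```

When the student already matches both teachers on their respective class ranges, the loss is zero regardless of what either teacher predicts outside its own range.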
Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in YOLO-based incremental object detection
Mitigates knowledge conflicts like foreground-background confusion and parameter interference
Enables real-time incremental learning using YOLO-World via efficient fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conflict-Aware Pseudo-Label Refinement mitigates foreground-background confusion
Importance-based Kernel Selection updates pivotal convolution kernels efficiently
Cross-Stage Asymmetric Knowledge Distillation handles misaligned distillation conflicts
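The kernel-selection idea above can be sketched in a few lines. The importance score used here (|weight × gradient| summed per output kernel) and the top-k selection rule are assumptions for illustration; the paper's exact criterion may differ. The point is that only the kernels most relevant to the current task receive updates, which limits parameter interference with knowledge from earlier stages.

```python
import numpy as np

# Hypothetical sketch of importance-based kernel selection for a conv layer.

def select_kernels(weight, grad, k):
    """weight, grad: (out_ch, in_ch, kh, kw) conv weight and gradient.
    Returns a boolean mask over output kernels: True = trainable."""
    # One importance score per output kernel: |w * g| summed over the kernel.
    importance = np.abs(weight * grad).sum(axis=(1, 2, 3))
    top = np.argsort(importance)[-k:]  # indices of the k most important
    mask = np.zeros(weight.shape[0], dtype=bool)
    mask[top] = True
    return mask

def masked_update(weight, grad, mask, lr=0.01):
    """Apply an SGD step only to selected kernels; the rest stay frozen."""
    w = weight.copy()
    w[mask] -= lr * grad[mask]
    return w
```

Freezing the unselected kernels is what makes the fine-tuning parameter-efficient: each stage touches only a small, task-relevant subset of the network.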