AI Summary
This work addresses the "background foregrounding" problem in Transformer-based incremental object detection, where the Hungarian matcher's enforced one-to-one assignment inadvertently treats background regions as foreground, thereby acting as a key mechanism of catastrophic forgetting. To mitigate this issue, the authors propose a Quality-guided Minimum Cost Maximum Flow (Q-MCMF) matcher that constructs a flow graph and prunes unreliable matches using geometric quality assessment, abandoning the rigid one-to-one constraint. This approach effectively prevents background regions from being erroneously supervised as foreground objects. Evaluated under various incremental settings on COCO, Q-MCMF consistently outperforms existing methods, simultaneously alleviating catastrophic forgetting and enhancing detection performance on newly introduced classes.
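To make the failure mode concrete, the toy example below illustrates how an exhaustive one-to-one assignment can pair a ground-truth box with a prediction that does not overlap it at all. It is only a sketch under stated assumptions: the cost is a plain `1 - IoU` rather than the full DETR matching cost, the boxes are made up, and a brute-force search over permutations stands in for the Hungarian algorithm (both return a minimum-cost assignment on inputs this small). The names `iou` and `exhaustive_one_to_one` are hypothetical helpers, not part of the paper.

```python
from itertools import permutations


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def exhaustive_one_to_one(gt_boxes, pred_boxes):
    """Give EVERY ground truth a distinct prediction, minimizing total (1 - IoU).

    Brute force over permutations; assumes len(pred_boxes) >= len(gt_boxes).
    """
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(1.0 - iou(g, pred_boxes[p]) for g, p in zip(gt_boxes, perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best)


# Illustrative boxes: prediction 0 covers ground truth 0 well,
# prediction 1 sits on background far away from both objects.
gt = [(0, 0, 10, 10), (50, 50, 60, 60)]
pred = [(1, 1, 10, 10), (100, 100, 110, 110)]

assignment = exhaustive_one_to_one(gt, pred)
for g, p in enumerate(assignment):
    print(f"gt {g} -> pred {p} (IoU = {iou(gt[g], pred[p]):.2f})")
```

In this toy case the second ground truth is still assigned to the background prediction even though their IoU is zero, which is the kind of supervision the summary calls background foregrounding.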
Abstract
Incremental Object Detection (IOD) aims to continuously learn new object classes without forgetting previously learned ones. A persistent challenge is catastrophic forgetting, primarily attributed to background shift in conventional detectors. While pseudo-labeling mitigates this in dense detectors, we identify a novel, distinct source of forgetting specific to DETR-like architectures: background foregrounding. This arises from the exhaustiveness constraint of the Hungarian matcher, which forcibly assigns every ground truth target to one prediction, even when predictions primarily cover background regions (i.e., low IoU). This erroneous supervision compels the model to misclassify background features as specific foreground classes, disrupting learned representations and accelerating forgetting. To address this, we propose a Quality-guided Min-Cost Max-Flow (Q-MCMF) matcher. To avoid forced assignments, Q-MCMF builds a flow graph and prunes implausible matches based on geometric quality. It then optimizes for the final matching that minimizes cost and maximizes valid assignments. This strategy eliminates harmful supervision from background foregrounding while maximizing foreground learning signals. Extensive experiments on the COCO dataset under various incremental settings demonstrate that our method consistently outperforms existing state-of-the-art approaches.
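The matching idea described above (build a flow graph, prune implausible edges, then solve for a minimum-cost maximum flow) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: IoU stands in for the unspecified "geometric quality", the threshold `iou_min=0.3` and the edge cost `1 - IoU` are placeholders, and the function names `box_iou` and `pruned_flow_match` are hypothetical. The solver is a plain successive-shortest-path routine with unit capacities, written in pure Python to stay self-contained.

```python
from collections import deque


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def pruned_flow_match(gt_boxes, pred_boxes, iou_min=0.3):
    """Return {gt_index: pred_index}; ground truths with no plausible edge stay unmatched."""
    n_gt, n_pred = len(gt_boxes), len(pred_boxes)
    src, snk = n_gt + n_pred, n_gt + n_pred + 1
    n_nodes = n_gt + n_pred + 2
    # Each edge is [target, residual_capacity, cost, index_of_reverse_edge].
    graph = [[] for _ in range(n_nodes)]

    def add_edge(u, v, cost):
        graph[u].append([v, 1, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for g in range(n_gt):
        add_edge(src, g, 0.0)            # each ground truth used at most once
    for p in range(n_pred):
        add_edge(n_gt + p, snk, 0.0)     # each prediction used at most once
    for g, gb in enumerate(gt_boxes):
        for p, pb in enumerate(pred_boxes):
            quality = box_iou(gb, pb)
            if quality >= iou_min:       # pruning: low-quality pairs get no edge
                add_edge(g, n_gt + p, 1.0 - quality)

    # Successive shortest paths (Bellman-Ford with a queue) until no path remains.
    while True:
        dist = [float("inf")] * n_nodes
        prev = [None] * n_nodes
        in_queue = [False] * n_nodes
        dist[src] = 0.0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[snk] is None:
            break
        v = snk
        while v != src:                  # push one unit of flow along the path
            u, i = prev[v]
            graph[u][i][1] -= 1
            graph[v][graph[u][i][3]][1] += 1
            v = u

    matches = {}
    for g in range(n_gt):
        for v, cap, _, _ in graph[g]:
            if n_gt <= v < n_gt + n_pred and cap == 0:  # saturated gt -> pred edge
                matches[g] = v - n_gt
    return matches


# Illustrative boxes: only ground truth 0 has a plausible prediction.
gt = [(0, 0, 10, 10), (50, 50, 60, 60)]
pred = [(1, 1, 10, 10), (100, 100, 110, 110), (0, 0, 8, 8)]
matches = pruned_flow_match(gt, pred)
print(matches)
```

Because pruned pairs have no edge in the graph, the maximum flow can be smaller than the number of ground truths, so a ground truth whose candidate predictions all lie on background is simply left unmatched instead of being forced onto one of them. Among the matchings that keep the most valid pairs, the flow solution is the one with the lowest total cost.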