🤖 AI Summary
This paper addresses catastrophic forgetting and feature confusion—both cross-session and intra-session—in multi-label class-incremental learning (MLCIL), proposing the first class-agnostic incremental learning paradigm. Methodologically, it introduces a class-specific token mechanism and a two-stage loss function comprising a cross-session discriminative loss and a class-level embedding optimization loss, enabling class-level (rather than image-level) feature decoupling and updating within a Class-Independent Incremental Network (CINet). This framework is the first to support genuine class-incremental multi-label recognition, substantially mitigating representation confusion and forgetting. Evaluated on MS-COCO and PASCAL VOC, it achieves significant mAP improvements, reduces average forgetting by 32.7%, and boosts new-class recognition accuracy by 18.4%, demonstrating superior accuracy and stability.
📝 Abstract
Current research on class-incremental learning primarily focuses on single-label classification tasks. However, real-world applications often involve multi-label scenarios, such as image retrieval and medical imaging. Therefore, this paper focuses on the challenging yet practical multi-label class-incremental learning (MLCIL) problem. In addition to the challenge of catastrophic forgetting, MLCIL encounters issues related to feature confusion, encompassing inter-session and intra-feature confusion. To address these problems, we propose a novel MLCIL approach called class-independent increment (CLIN). Specifically, in contrast to existing methods that extract image-level features, we propose a class-independent incremental network (CINet) to extract multiple class-level embeddings for multi-label samples. It learns and preserves the knowledge of different classes by constructing class-specific tokens. On this basis, we develop two novel loss functions, optimizing the learning of class-specific tokens and class-level embeddings, respectively. These losses aim to distinguish between new and old classes, further alleviating the problem of feature confusion. Extensive experiments on MS-COCO and PASCAL VOC datasets demonstrate the effectiveness of our method for improving recognition performance and mitigating forgetting on various MLCIL tasks.