🤖 AI Summary
This work addresses class-incremental learning (CIL) under task-ID-agnostic settings, aiming to mitigate catastrophic forgetting induced by semantic drift. We first identify mean shift and covariance shift in the feature space as the primary causes of semantic drift. To address this, we propose a task-agnostic joint distribution calibration framework: (i) mean compensation to suppress category-center drift; (ii) Mahalanobis-distance regularization to align feature covariances across tasks; and (iii) feature-level self-distillation to enhance representation stability. The method operates without task identifiers and is plug-and-play compatible with mainstream CIL pipelines. Extensive experiments on standard benchmarks—including CIFAR-100, ImageNet-100, and ImageNet-1K—demonstrate significant improvements over state-of-the-art approaches, substantially alleviating forgetting of old classes. Our implementation is publicly available.
📝 Abstract
Class-incremental learning (CIL) seeks to enable a model to sequentially learn new classes while retaining knowledge of previously learned ones. Balancing flexibility and stability remains a significant challenge, particularly when the task ID is unknown. To address this, our study reveals that the gap in feature distribution between novel and existing tasks is primarily driven by differences in the mean and covariance of the features. Building on this insight, we propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration. Specifically, we calculate each class's mean by averaging its sample embeddings, and we estimate task shifts from embedding changes weighted by their proximity to the previous mean, which captures the mean shift of every learned class each time a new task arrives. We also apply a Mahalanobis distance constraint for covariance calibration, aligning class-specific embedding covariances between the old and current networks to mitigate covariance shift. Additionally, we integrate a feature-level self-distillation approach to enhance generalization. Comprehensive experiments on commonly used datasets demonstrate the effectiveness of our approach. The source code is available at https://github.com/fwu11/MACIL.git.
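The mean shift compensation and the Mahalanobis distance mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Gaussian proximity kernel, the `sigma` bandwidth, and the `eps` regularizer are all assumptions made for illustration, and the sketch shows only the distance itself, not the covariance calibration loss built on it.

```python
import numpy as np


def class_means(embeddings, labels):
    """Return {class label: mean embedding}, averaging each class's samples."""
    return {int(c): embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}


def compensate_mean(old_mean, feats_old, feats_new, sigma=1.0):
    """Shift a stored class mean by a proximity-weighted average of embedding changes.

    feats_old and feats_new hold embeddings of the same current-task samples
    under the previous and the updated network. The Gaussian kernel and the
    sigma bandwidth are assumptions, not details taken from the abstract.
    """
    delta = feats_new - feats_old                       # per-sample embedding change
    sq_dist = np.sum((feats_old - old_mean) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))     # closer samples weigh more
    drift = (weights[:, None] * delta).sum(axis=0) / (weights.sum() + 1e-12)
    return old_mean + drift


def mahalanobis_sq(x, mean, cov, eps=1e-6):
    """Squared Mahalanobis distance of x from a class with the given mean and covariance.

    eps adds a small ridge to the covariance for numerical stability (an assumption).
    """
    diff = x - mean
    cov_reg = cov + eps * np.eye(cov.shape[0])
    return float(diff @ np.linalg.solve(cov_reg, diff))
```

If every embedding moves by the same vector between the old and the new network, `compensate_mean` shifts the stored class mean by that vector, which is the behaviour a drift estimate should have in the simplest case. With an identity covariance, `mahalanobis_sq` reduces (up to the `eps` ridge) to the squared Euclidean distance.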