🤖 AI Summary
In continual learning, models suffer from catastrophic forgetting of past tasks when no rehearsal data is available—in particular, the classifier head becomes biased toward the most recent task, degrading performance on earlier ones. To address memory-free continual learning, this paper proposes an inference-time adaptive correction paradigm. First, it introduces an Out-of-Task Detection (OTD) mechanism to identify whether each test sample belongs to a past task. Second, it designs an adaptive retention strategy that dynamically tunes the classifier layer to preserve discriminative capability for past tasks. Third, it incorporates an adaptive correction method that revises predictions when past-task samples are misclassified into current-task classes. Crucially, the approach requires no changes to training: it can be plugged into virtually any existing continual learning method without modifying its training procedure. Evaluated on CIFAR-100 and ImageNet-R, it improves average accuracy by 2.7% and 2.6%, respectively, significantly enhancing robustness on past tasks over state-of-the-art methods.
📝 Abstract
Continual learning, also known as lifelong learning or incremental learning, refers to the process by which a model learns from a stream of incoming data over time. A common problem in continual learning is the classification layer's bias towards the most recent task. Traditionally, methods have relied on incorporating data from past tasks during training to mitigate this issue. However, the recent shift in continual learning toward memory-free environments has rendered these approaches infeasible. In this study, we propose a solution focused on the testing phase. We first introduce a simple Out-of-Task Detection method, OTD, designed to accurately identify samples from past tasks during testing. Leveraging OTD, we then propose: (1) an Adaptive Retention mechanism for dynamically tuning the classifier layer on past-task data; (2) an Adaptive Correction mechanism for revising predictions when the model classifies data from previous tasks into classes from the current task. We name our approach Adaptive Retention & Correction (ARC). While designed for memory-free environments, ARC also proves effective in memory-based settings. Extensive experiments show that our proposed method can be plugged into virtually any existing continual learning approach without requiring any modifications to its training procedure. Specifically, when integrated with state-of-the-art approaches, ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
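To make the correction idea concrete, the following is a minimal sketch of inference-time prediction revision in the spirit of ARC. It is not the paper's actual algorithm: the probability-mass margin rule standing in for OTD, and the `arc_correct` function itself, are illustrative assumptions. The sketch only shows the general pattern of detecting a likely past-task sample and restricting the prediction to past-task classes.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def arc_correct(logits, past_classes, current_classes, margin=0.1):
    """Hypothetical inference-time correction (illustration only).

    If the top prediction falls in the current task's classes but the
    probability mass on past-task classes is comparable (within `margin`),
    treat the sample as belonging to a past task and re-predict among
    past-task classes. The margin test is a stand-in for the paper's
    Out-of-Task Detection criterion, which is not specified here.
    """
    probs = softmax(logits)
    pred = max(range(len(probs)), key=lambda i: probs[i])
    past_mass = sum(probs[i] for i in past_classes)
    cur_mass = sum(probs[i] for i in current_classes)
    if pred in current_classes and past_mass >= cur_mass - margin:
        # Likely a past-task sample: restrict argmax to past-task classes.
        pred = max(past_classes, key=lambda i: probs[i])
    return pred
```

For instance, with past-task classes {0, 1} and current-task classes {2, 3}, an ambiguous sample whose probability mass leans toward the past task would be re-assigned to a past-task class, while a confidently current-task sample would be left untouched.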