AI Summary
Class-incremental learning (CIL) suffers from catastrophic forgetting and performs substantially worse than the full-data training oracle. To address this, we propose a parameter transition framework grounded in linear mode connectivity: we first empirically identify low-loss linear paths connecting incremental solutions to the oracle in parameter space, and then design an incremental parameter update mechanism that exploits this property. Our approach incorporates an efficient vector transformation based on a diagonal Fisher information matrix approximation, making it compatible with both exemplar-free and exemplar-based rehearsal settings. On CIFAR-100, our method improves the final accuracy of the PASS baseline by 5.12% and reduces forgetting by 2.54%. On FGVC-Aircraft, it boosts the SLCA baseline's average accuracy by 14.93% and final accuracy by 21.95%. These gains significantly narrow the performance gap between CIL and the oracle, demonstrating the effectiveness of exploiting linear mode connectivity for robust continual adaptation.
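The low-loss linear paths mentioned above are typically probed by evaluating the loss at interpolated points between two parameter vectors and measuring the loss barrier along the segment. The following is a minimal NumPy sketch of such a probe on a toy quadratic loss standing in for the task loss; all function and variable names are illustrative and not taken from the paper.

```python
import numpy as np

def loss(theta):
    # Toy quadratic loss standing in for a task loss; illustrative only.
    return float(np.sum((theta - 1.0) ** 2))

def linear_path_losses(theta_a, theta_b, steps=11):
    """Evaluate the loss at evenly spaced points on the segment between two solutions."""
    alphas = np.linspace(0.0, 1.0, steps)
    return [loss((1 - a) * theta_a + a * theta_b) for a in alphas]

# Two solutions on opposite sides of the optimum of the toy loss.
theta_a = np.zeros(4)
theta_b = np.full(4, 2.0)

losses = linear_path_losses(theta_a, theta_b)
# Loss barrier: worst loss on the path minus the worse of the two endpoints.
barrier = max(losses) - max(losses[0], losses[-1])
```

A barrier near zero indicates the two solutions are linearly mode connected; in this convex toy example the barrier is exactly zero, whereas for real CIL checkpoints the barrier must be estimated empirically on held-out data.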
Abstract
Class Incremental Learning (CIL) aims to sequentially acquire knowledge of new classes without forgetting previously learned ones. Despite recent progress, current CIL methods still exhibit significant performance gaps compared to their oracle counterparts, i.e., models trained with full access to historical data. Inspired by recent insights on Linear Mode Connectivity (LMC), we revisit the geometric properties of oracle solutions in CIL and uncover a fundamental observation: these oracle solutions typically maintain low-loss linear connections to the optima of previous tasks. Motivated by this finding, we propose Increment Vector Transformation (IVT), a novel plug-and-play framework designed to mitigate catastrophic forgetting during training. Rather than directly following CIL updates, IVT periodically teleports the model parameters to transformed solutions that preserve linear connectivity to previous task optima. By maintaining low loss along these connecting paths, IVT effectively ensures stable performance on previously learned tasks. The transformation is efficiently approximated using diagonal Fisher Information Matrices, making IVT suitable for both exemplar-free and exemplar-based scenarios, and compatible with various initialization strategies. Extensive experiments on CIFAR-100, FGVC-Aircraft, ImageNet-Subset, and ImageNet-Full demonstrate that IVT consistently enhances the performance of strong CIL baselines. Specifically, on CIFAR-100, IVT improves the last accuracy of the PASS baseline by +5.12% and reduces forgetting by 2.54%. For the CLIP-pre-trained SLCA baseline on FGVC-Aircraft, IVT yields gains of +14.93% in average accuracy and +21.95% in last accuracy. The code will be released.
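The diagonal Fisher Information Matrix used to approximate the transformation is, in its standard form, the expected elementwise square of the per-sample score (gradient of the log-likelihood). Below is a minimal NumPy sketch of that standard estimate on a toy logistic-regression model; the model, names, and data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_log_lik(theta, x, y):
    # Score: gradient of log p(y | x; theta) for logistic regression.
    p = sigmoid(x @ theta)
    return (y - p) * x

def diagonal_fisher(theta, X, Y):
    """Diagonal Fisher estimate: mean of elementwise-squared per-sample scores."""
    g2 = np.zeros_like(theta)
    for x, y in zip(X, Y):
        g = grad_log_lik(theta, x, y)
        g2 += g * g
    return g2 / len(X)

theta = rng.normal(size=3)
X = rng.normal(size=(64, 3))
# For the true Fisher, labels are sampled from the model's own distribution.
Y = (rng.random(64) < sigmoid(X @ theta)).astype(float)

F_diag = diagonal_fisher(theta, X, Y)
```

Keeping only the diagonal reduces storage from O(d^2) to O(d) in the number of parameters, which is what makes Fisher-weighted transformations practical for deep networks.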