🤖 AI Summary
To address the catastrophic forgetting that degrades long-term performance in class-incremental learning (CIL), this paper proposes a training method that optimizes model weights directly in parameter space. Its core innovation is a dual-level weight merging mechanism, operating both across tasks and within each task, coupled with bounded update constraints. Without altering the network architecture or loss function, and without introducing auxiliary modules, the method jointly optimizes old-knowledge retention and new-class learning. Built upon weight averaging, it explicitly models parameter evolution trajectories to enhance training stability. Evaluated on standard CIL benchmarks, including CIFAR-100, ImageNet-100, and Tiny-ImageNet, the approach consistently surpasses existing state-of-the-art methods, achieving average accuracy gains of 3.2–5.7 percentage points. Notably, its advantage grows with longer task sequences, demonstrating superior scalability and robustness.
📝 Abstract
We present a novel training approach for Class Incremental Learning (CIL), named Merge-and-Bound (M&B), which directly manipulates model weights in the parameter space for optimization. Our algorithm involves two types of weight merging: inter-task weight merging and intra-task weight merging. Inter-task weight merging unifies previous models by averaging the weights of models from all previous stages. Intra-task weight merging, on the other hand, facilitates the learning of the current task by combining the model parameters within the current stage. For reliable weight merging, we also propose a bounded update technique that aims to optimize the target model with minimal cumulative updates and to preserve knowledge from previous tasks; this strategy shows that effective new models can be obtained near old ones, which reduces catastrophic forgetting. M&B integrates seamlessly into existing CIL methods without modifying architectural components or revising learning objectives. We extensively evaluate our algorithm on standard CIL benchmarks and demonstrate superior performance compared to state-of-the-art methods.
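The three ingredients named in the abstract can be illustrated with a small, dependency-free sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: weights are plain dictionaries of flattened values, inter-task merging is a uniform average, intra-task merging is a running average of checkpoints, and the bounded update is an L2-ball projection around the merged base model. The helper names `average_weights`, `running_average`, and `bound_update` are hypothetical and do not come from the paper.

```python
import math
from typing import Dict, List

# Assumed representation: parameter name -> flattened list of values.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Inter-task merging (assumed form): uniform element-wise average."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(values))]
        for name, values in models[0].items()
    }


def running_average(avg: Weights, new: Weights, count: int) -> Weights:
    """Intra-task merging (assumed form): fold one more checkpoint into an
    average that already covers `count` checkpoints of the current stage."""
    return {
        name: [(a * count + w) / (count + 1) for a, w in zip(avg[name], new[name])]
        for name in avg
    }


def bound_update(current: Weights, anchor: Weights, radius: float) -> Weights:
    """Bounded update (assumed form): if `current` has moved more than
    `radius` (L2 norm over all parameters) from `anchor`, rescale the
    displacement so it lies on the ball of that radius."""
    squared = sum(
        (c - a) ** 2
        for name in anchor
        for c, a in zip(current[name], anchor[name])
    )
    norm = math.sqrt(squared)
    if norm <= radius:
        return {name: list(values) for name, values in current.items()}
    scale = radius / norm
    return {
        name: [a + scale * (c - a) for c, a in zip(current[name], anchor[name])]
        for name in anchor
    }


# Toy run with two earlier stage models and one drifted checkpoint.
stage_models = [{"w": [0.0, 2.0]}, {"w": [2.0, 4.0]}]
base = average_weights(stage_models)               # merged starting point
drifted = {"w": [4.0, 7.0]}                        # displacement (3, 4), norm 5
bounded = bound_update(drifted, base, radius=1.0)  # pulled back to norm 1
merged = running_average(base, bounded, count=1)   # intra-task average
```

In this toy run the drifted checkpoint sits at distance 5 from the merged base, so the projection shrinks its displacement to the chosen radius of 1 before it is folded into the intra-task average. A real CIL pipeline would apply the same operations to framework tensors, and the radius would be a tuned hyperparameter.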