SEDEG: Sequential Enhancement of Decoder and Encoder's Generality for Class Incremental Learning with Small Memory

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address catastrophic forgetting in class-incremental learning under small-memory settings, this paper proposes a two-stage training framework that sequentially enhances the generalization capability of both the decoder and the encoder in vision Transformers. In Stage I, feature boosting and ensemble encoder training improve representation robustness, which in turn generalizes the decoder and balances the classifier. In Stage II, the ensembled encoder is compressed into a new, more generalized encoder via knowledge distillation, combining balanced KD on old-task logits with intermediate feature KD to mitigate the representational imbalance between old and new knowledge. The method requires no large replay buffer. It achieves state-of-the-art performance on three standard benchmarks, outperforming existing memory-efficient incremental learning approaches, and ablation studies validate the effectiveness of each component and demonstrate clear synergistic gains from their integration.

📝 Abstract
In incremental learning, enhancing the generality of knowledge is crucial for adapting to dynamic data inputs. It can develop generalized representations or more balanced decision boundaries, preventing the degradation of long-term knowledge over time and thus mitigating catastrophic forgetting. Some emerging incremental learning methods adopt an encoder-decoder architecture and have achieved promising results. In the encoder-decoder architecture, improving the generalization capabilities of both the encoder and the decoder is critical, as it helps preserve previously learned knowledge while ensuring adaptability and robustness to new, diverse data inputs. However, many existing continual learning methods focus solely on enhancing one of the two components, which limits their effectiveness in mitigating catastrophic forgetting; these methods perform even worse in small-memory scenarios, where only a limited number of historical samples can be stored. To address this limitation, we introduce SEDEG, a two-stage training framework for vision transformers (ViT) that sequentially improves the generality of both the decoder and the encoder. First, SEDEG trains an ensembled encoder through feature boosting to learn generalized representations, which in turn enhance the decoder's generality and balance the classifier. In the second stage, knowledge distillation (KD) strategies compress the ensembled encoder into a new, more generalized encoder, combining a balanced KD approach with feature KD for effective knowledge transfer. Extensive experiments on three benchmark datasets show SEDEG's superior performance, and ablation studies confirm the efficacy of its components. The code is available at https://github.com/ShaolingPu/CIL.
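The second stage described above combines logit-level (balanced) KD with feature-level KD. The following is a minimal, framework-agnostic sketch of how such a combined loss could look; the helper names, the old-class re-weighting scheme, and the temperature/weighting values are illustrative assumptions, not the paper's actual implementation:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two discrete distributions.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def feature_kd_loss(teacher_feats, student_feats):
    # Mean squared error between intermediate feature vectors.
    n = len(teacher_feats)
    return sum((t - s) ** 2 for t, s in zip(teacher_feats, student_feats)) / n

def balanced_kd_loss(teacher_logits, student_logits,
                     old_class_mask, temperature=2.0, old_weight=2.0):
    # Up-weight old-class logits before distillation to counter the
    # old/new class imbalance (this weighting scheme is an assumption).
    def reweight(logits):
        return [z + math.log(old_weight) if is_old else z
                for z, is_old in zip(logits, old_class_mask)]
    p = softmax(reweight(teacher_logits), temperature)
    q = softmax(reweight(student_logits), temperature)
    # T^2 factor keeps gradient magnitudes comparable across temperatures.
    return (temperature ** 2) * kl_divergence(p, q)

def stage2_loss(teacher_logits, student_logits, old_class_mask,
                teacher_feats, student_feats, alpha=0.5):
    # Combine logit-level (balanced) KD with feature-level KD.
    return (alpha * balanced_kd_loss(teacher_logits, student_logits, old_class_mask)
            + (1 - alpha) * feature_kd_loss(teacher_feats, student_feats))
```

When the student matches the teacher exactly, both terms vanish; any divergence in either the (re-weighted) logits or the intermediate features increases the loss, so the compressed encoder is pushed to reproduce both the teacher's decisions and its representations.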
Problem

Research questions and friction points this paper is trying to address.

Enhancing encoder-decoder generality for incremental learning
Mitigating catastrophic forgetting in small-memory scenarios
Balancing knowledge transfer via two-stage ViT training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage ViT training for incremental learning
Feature boosting for generalized encoder-decoder
Balanced knowledge distillation for small memory
Hongyang Chen
Sun Yat-sen University
SDN · Cloud Computing · Microservice · AIOps

Shaoling Pu
Zhejiang Lab, Hangzhou, China

Lingyu Zheng
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, China

Zhongwu Sun
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, China