Min: Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning

📅 2025-09-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
In class-incremental learning (CIL), lightweight fine-tuning of pretrained backbones often induces parameter drift and degrades generalization. To address this, we propose an information-theoretically guided "beneficial noise" mechanism: task-specific noise is dynamically injected into intermediate feature layers to selectively suppress low-relevance activations, thereby mitigating catastrophic forgetting while preserving discriminative patterns from previous tasks. Our approach requires no architectural modification to the backbone; instead, it introduces a learnable noise-embedding module in the feature space, jointly optimized via high-dimensional feature constraints and dynamic noise weighting for effective noise regularization. Evaluated on six standard benchmarks, our method achieves state-of-the-art performance, excelling particularly in fine-grained 50-step incremental settings, demonstrating the modeling efficacy and generalization gains enabled by controllable, information-guided noise in continual learning.

📝 Abstract
Class-Incremental Learning (CIL) aims to continuously learn new categories while retaining knowledge of old ones. Pre-trained models (PTMs) show promising capabilities in CIL. However, existing approaches that apply lightweight fine-tuning to backbones still induce parameter drift, thereby compromising the generalization capability of pre-trained models. Parameter drift can be conceptualized as a form of noise that obscures critical patterns learned for previous tasks. However, recent research has shown that noise is not always harmful. For example, the large number of visual patterns learned during pre-training can easily be monopolized by a single task, and introducing appropriate noise can suppress some low-correlation features, thus leaving a margin for future tasks. To this end, we propose learning beneficial noise for CIL guided by information theory and introduce Mixture of Noise (Min), which aims to mitigate the degradation of backbone generalization caused by adapting to new tasks. Specifically, task-specific noise is learned from the high-dimensional features of new tasks. Then, a set of weights is adjusted dynamically for an optimal mixture of the different task noises. Finally, Min embeds the beneficial noise into the intermediate features to mask the response of inefficient patterns. Extensive experiments on six benchmark datasets demonstrate that Min achieves state-of-the-art performance in most incremental settings, with particularly strong results in 50-step incremental settings. This demonstrates the significant potential of beneficial noise in continual learning.
Problem

Research questions and friction points this paper is trying to address.

Mitigating parameter drift in pre-trained models during class-incremental learning
Learning beneficial noise to preserve generalization capability for future tasks
Dynamically mixing task-specific noise to mask inefficient feature patterns
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learns task-specific noise from high-dimensional features
Dynamically adjusts weights for optimal noise mixture
Embeds beneficial noise to mask inefficient patterns
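The three steps above (learn per-task noise, mix it with dynamic weights, inject it into intermediate features) can be sketched in a minimal, self-contained way. This is an illustrative assumption of how such a module might look, not the authors' implementation; the class name `MixtureOfNoise`, the Gaussian initialization, and the softmax mixing are all hypothetical choices:

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

class MixtureOfNoise:
    """Hypothetical sketch of the Min idea: one learnable noise vector per
    task, mixed with softmax weights and added to intermediate features."""

    def __init__(self, feat_dim, seed=0):
        self.feat_dim = feat_dim
        self.task_noises = []  # one noise vector per task (learned in practice)
        self.mix_logits = []   # one mixing logit per task (adjusted dynamically)
        self.rng = random.Random(seed)

    def add_task(self):
        # New task: initialize a small task-specific noise vector.
        # In the paper this noise would be learned from high-dimensional
        # features of the new task; here it is just randomly initialized.
        self.task_noises.append(
            [self.rng.gauss(0.0, 0.01) for _ in range(self.feat_dim)]
        )
        self.mix_logits.append(0.0)

    def __call__(self, features):
        # Mix the per-task noises with softmax weights, then add the mixture
        # to the intermediate features to mask low-relevance activations.
        w = softmax(self.mix_logits)
        mixed = [
            sum(w[t] * self.task_noises[t][d] for t in range(len(w)))
            for d in range(self.feat_dim)
        ]
        return [f + n for f, n in zip(features, mixed)]

min_module = MixtureOfNoise(feat_dim=4)
min_module.add_task()
min_module.add_task()
perturbed = min_module([1.0, 2.0, 3.0, 4.0])
```

In a real system the noise vectors and mixing logits would be trained end to end (e.g. as `nn.Parameter`s in PyTorch) rather than fixed after initialization; the sketch only shows the data flow of the mixture-and-inject step.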