Exploring Stability-Plasticity Trade-offs for Continual Named Entity Recognition

📅 2025-08-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the fundamental stability–plasticity dilemma in continual named entity recognition (CNER), where models struggle to retain prior knowledge while adapting to new entity types. To resolve it, we propose a dual-perspective co-optimization framework that jointly regulates the representation and parameter spaces. Methodologically, (1) we design representation-dimension-aggregated knowledge distillation to mitigate the excessive stability induced by conventional knowledge distillation; (2) we introduce a weight-guided selective fusion mechanism that dynamically balances parameter inheritance from old models with adaptation to new tasks; and (3) we adopt a confidence-driven pseudo-labeling strategy to suppress semantic drift in the non-entity class. Evaluated on three benchmark datasets under ten diverse continual learning settings, our approach consistently outperforms existing state-of-the-art methods. It is the first to achieve controllable trade-offs in both the representation and parameter spaces, establishing an interpretable and scalable paradigm for CNER.

📝 Abstract
Continual Named Entity Recognition (CNER) is an evolving field that focuses on sequentially updating an existing model to incorporate new entity types. Previous CNER methods primarily utilize Knowledge Distillation (KD) to preserve prior knowledge and overcome catastrophic forgetting, strictly ensuring that the representations of old and new models remain consistent. Consequently, they often impart the model with excessive stability (i.e., retention of old knowledge) but limited plasticity (i.e., acquisition of new knowledge). To address this issue, we propose a Stability-Plasticity Trade-off (SPT) method for CNER that balances these aspects from both representation and weight perspectives. From the representation perspective, we introduce a pooling operation into the original KD, permitting a level of plasticity by consolidating representation dimensions. From the weight perspective, we dynamically merge the weights of old and new models, strengthening old knowledge while maintaining new knowledge. During this fusion, we implement a weight-guided selective mechanism to prioritize significant weights. Moreover, we develop a confidence-based pseudo-labeling approach for the current non-entity type, which predicts entity types using the old model to handle the semantic shift of the non-entity type, a challenge specific to CNER that has largely been ignored by previous methods. Extensive experiments across ten CNER settings on three benchmark datasets demonstrate that our SPT method surpasses previous CNER approaches, highlighting its effectiveness in achieving a suitable stability-plasticity trade-off.
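The abstract's first contribution, introducing a pooling operation into the KD objective so that representations are matched only after their dimensions are consolidated, can be illustrated with a minimal sketch. The average-pooling granularity (`pool_size`) and the MSE matching objective are assumptions for illustration, not the paper's exact loss:

```python
import numpy as np

def pooled_kd_loss(student_repr, teacher_repr, pool_size=4):
    """Dimension-aggregated KD sketch: average-pool hidden dimensions of
    both models' representations before matching them, so the student only
    has to agree with the teacher at a coarser granularity, leaving
    unconstrained degrees of freedom for plasticity."""
    def pool(x):
        b, d = x.shape
        # Group adjacent dimensions and average within each group.
        return x.reshape(b, d // pool_size, pool_size).mean(axis=-1)

    diff = pool(student_repr) - pool(teacher_repr)
    return float((diff ** 2).mean())
```

By Jensen's inequality, the pooled loss is never larger than the element-wise MSE on the raw representations, which is one way to see that it imposes a strictly weaker (more plastic) constraint than conventional representation-matching KD.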
Problem

Research questions and friction points this paper is trying to address.

Balancing stability and plasticity in continual named entity recognition
Addressing catastrophic forgetting while acquiring new entity types
Handling semantic shift of non-entity types in CNER
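The semantic-shift problem above arises because tokens belonging to previously learned entity types are annotated as non-entity in the current task's data. A minimal sketch of the confidence-based remedy the paper describes: re-label non-entity tokens with the old model's prediction when its confidence clears a threshold. The threshold `tau` and the label id convention are illustrative assumptions, not the paper's exact choices:

```python
import numpy as np

def relabel_non_entity(old_model_probs, labels, non_entity_id=0, tau=0.9):
    """Confidence-based pseudo-labeling sketch: for tokens currently
    labeled non-entity, adopt the old model's predicted entity type
    when its max probability is at least tau; otherwise keep the label."""
    preds = old_model_probs.argmax(axis=-1)   # old model's predicted type per token
    conf = old_model_probs.max(axis=-1)       # its confidence in that prediction
    out = labels.copy()
    mask = (labels == non_entity_id) & (preds != non_entity_id) & (conf >= tau)
    out[mask] = preds[mask]
    return out
```

Low-confidence predictions are deliberately left as non-entity, so noisy old-model outputs are not propagated into the new task's training labels.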
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stability-Plasticity Trade-off method balances representation and weight
Pooling operation in Knowledge Distillation enhances plasticity
Confidence-based pseudo-labeling handles non-entity type semantic shift
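The weight-fusion idea listed above can be sketched as an importance-gated interpolation between old and new parameters. The magnitude-based importance score and the scalar `alpha` are illustrative assumptions; the paper's actual selective mechanism may weight parameters differently:

```python
import numpy as np

def selective_fuse(old_w, new_w, alpha=0.5):
    """Weight-guided selective fusion sketch: interpolate old and new
    weights per parameter, pulling parameters that were significant in
    the old model (here, large in magnitude) more strongly back toward
    their old values, while letting less significant ones follow the
    newly adapted model."""
    importance = np.abs(old_w) / (np.abs(old_w).max() + 1e-12)  # in [0, 1]
    keep_old = alpha * importance  # per-parameter retention strength
    return keep_old * old_w + (1.0 - keep_old) * new_w
```

At `alpha=0` the fused model is just the new model (pure plasticity); at `alpha=1` the old model's most significant parameters are fully restored (stability for what mattered most), which makes the stability–plasticity trade-off an explicit dial.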
Duzhen Zhang
Institute of Automation, Chinese Academy of Sciences
Natural Language Processing · Multimodal · Large Language Models · Continual Learning · AI4Science
Chenxing Li
Tencent, AI Lab, Beijing, China
Jiahua Dong
Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Qi Liu
University of Hong Kong, Hong Kong, China
Dong Yu
Tencent, AI Lab, Bellevue, WA 98004 USA