🤖 AI Summary
To address inaccurate task identification, catastrophic forgetting, and within-task and cross-task distribution shifts in continual relation extraction (CRE), this paper proposes WAVE++, a rehearsal-free prompt learning framework. The method (i) introduces task-specific prompt pools together with a training-free task identification mechanism applied at inference; (ii) incorporates label descriptions to help the model distinguish among relations; and (iii) uses a generative model to consolidate prior knowledge within the shared parameters. The design draws on the connection between prefix-tuning and Mixture of Experts (MoE) and stores no samples from previous tasks, which reduces memory usage and avoids the privacy concerns of rehearsal. Experiments on multiple CRE benchmarks show that WAVE++ outperforms state-of-the-art prompt-based and rehearsal-based methods.
📝 Abstract
Memory-based approaches have shown strong performance in Continual Relation Extraction (CRE). However, storing examples from previous tasks increases memory usage and raises privacy concerns. Recently, prompt-based methods have emerged as a promising alternative, as they do not rely on storing past samples. Despite this progress, current prompt-based techniques face several core challenges in CRE, particularly in accurately identifying task identities and mitigating catastrophic forgetting. Existing prompt selection strategies often suffer from inaccuracies, lack robust mechanisms to prevent forgetting in shared parameters, and struggle to handle both cross-task and within-task variations. In this paper, we propose WAVE++, a novel approach inspired by the connection between prefix-tuning and mixture of experts. Specifically, we introduce task-specific prompt pools that enhance flexibility and adaptability across diverse tasks while avoiding boundary-spanning risks; this design more effectively captures variations within each task and across tasks. To further refine relation classification, we incorporate label descriptions that provide richer, more global context, enabling the model to better distinguish among different relations. We also propose a training-free mechanism to improve task prediction during inference. Moreover, we integrate a generative model to consolidate prior knowledge within the shared parameters, thereby removing the need for explicit data storage. Extensive experiments demonstrate that WAVE++ outperforms state-of-the-art prompt-based and rehearsal-based methods, offering a more robust solution for continual relation extraction. Our code is publicly available at https://github.com/PiDinosauR2804/WAVE-CRE-PLUS-PLUS.
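To make the idea of task-specific prompt pools more concrete, the following is a minimal sketch of how a query could first be routed to a task and then matched against that task's own pool. It is not the WAVE++ implementation. The class `TaskPromptPools`, the cosine-similarity key matching, and the nearest-prototype rule used as a stand-in for training-free task prediction are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class TaskPromptPools:
    """Hypothetical container: one pool of (key, prompt) pairs per task."""

    def __init__(self):
        self.pools = {}       # task_id -> list of (key_vector, prompt)
        self.prototypes = {}  # task_id -> mean feature vector of that task

    def add_task(self, task_id, keyed_prompts, features):
        # Each task gets its own pool, so prompts are never shared across tasks.
        self.pools[task_id] = list(keyed_prompts)
        dim = len(features[0])
        # Assumption: a task is summarised by the mean of its feature vectors,
        # so no raw samples need to be kept after the task has been seen.
        self.prototypes[task_id] = [
            sum(f[i] for f in features) / len(features) for i in range(dim)
        ]

    def predict_task(self, query):
        # Stand-in for training-free task prediction: pick the nearest prototype.
        return max(self.prototypes, key=lambda t: cosine(query, self.prototypes[t]))

    def select_prompts(self, query, top_k=1):
        # Route to a task, then rank only that task's prompts by key similarity.
        task_id = self.predict_task(query)
        ranked = sorted(
            self.pools[task_id], key=lambda kp: cosine(query, kp[0]), reverse=True
        )
        return task_id, [prompt for _, prompt in ranked[:top_k]]


# Toy data: 2-dimensional features and string placeholders for prompts.
pools = TaskPromptPools()
pools.add_task(
    "task_a",
    [([1.0, 0.0], "a_prompt_0"), ([0.8, 0.6], "a_prompt_1")],
    features=[[1.0, 0.1], [0.9, 0.0]],
)
pools.add_task(
    "task_b",
    [([0.0, 1.0], "b_prompt_0"), ([-0.6, 0.8], "b_prompt_1")],
    features=[[0.0, 1.0], [0.1, 0.9]],
)

result_a = pools.select_prompts([0.95, 0.05], top_k=1)
result_b = pools.select_prompts([0.0, 1.0], top_k=2)
```

In this toy setup the first query is routed to `task_a` and the second to `task_b`, and each query only ever sees prompts from the pool of its predicted task. In a real system the prompts would be learned prefix vectors rather than strings, and the query would come from a pretrained encoder.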