🤖 AI Summary
To meet the practical need for continual knowledge updating in deployed large language models (LLMs), this paper proposes MEMOIR, a lifelong model editing method that requires no retraining, introduces minimal interference, and scales to long editing sequences. The core innovation is a residual memory module with sparse activation pattern matching: sample-dependent sparse masks confine each edit to a distinct subset of the memory parameters, and query-aware pattern matching at inference retrieves only the relevant edits, achieving both edit isolation and semantic generalization; lightweight parameter injection integrates corrective knowledge into the residual path. Evaluated on LLaMA-3 and Mistral, the method scales to thousands of sequential edits with minimal catastrophic forgetting and consistently outperforms state-of-the-art approaches across diverse tasks, including question answering, hallucination correction, and out-of-distribution generalization, demonstrating robustness, scalability, and efficacy in real-world continual adaptation scenarios.
📝 Abstract
Language models deployed in real-world systems often require post-hoc updates to incorporate new or corrected knowledge. However, editing such models efficiently and reliably, without retraining or forgetting previous information, remains a major challenge. Existing methods for lifelong model editing either compromise generalization, interfere with past edits, or fail to scale to long editing sequences. We propose MEMOIR, a novel scalable framework that injects knowledge through a residual memory, i.e., a dedicated parameter module, while preserving the core capabilities of the pre-trained model. By sparsifying input activations through sample-dependent masks, MEMOIR confines each edit to a distinct subset of the memory parameters, minimizing interference among edits. At inference, it identifies relevant edits by comparing the sparse activation patterns of new queries to those stored during editing. This enables generalization to rephrased queries by activating only the relevant knowledge while suppressing unnecessary memory activation for unrelated prompts. Experiments on question answering, hallucination correction, and out-of-distribution generalization benchmarks across LLaMA-3 and Mistral demonstrate that MEMOIR achieves state-of-the-art performance across reliability, generalization, and locality metrics, scaling to thousands of sequential edits with minimal forgetting.
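The mechanism described above, sample-dependent sparse masks that confine each edit to its own parameter subset, plus retrieval by matching a query's sparse activation pattern against stored patterns, can be sketched in a few lines. This is a minimal illustration based only on the abstract, not the authors' implementation; the names (`EditMemory`, `TOP_K`, the overlap threshold) and the top-k masking choice are assumptions for exposition.

```python
import numpy as np

TOP_K = 8   # active memory dimensions per edit (assumed hyperparameter)
DIM = 64    # activation dimensionality (assumed, toy-sized)

def sparse_mask(activation: np.ndarray, k: int = TOP_K) -> np.ndarray:
    """Sample-dependent mask: keep the k largest-magnitude activations."""
    idx = np.argsort(-np.abs(activation))[:k]
    mask = np.zeros_like(activation, dtype=bool)
    mask[idx] = True
    return mask

class EditMemory:
    """Residual memory: each edit writes only to its own sparse subset."""
    def __init__(self, dim: int = DIM):
        self.weights = np.zeros(dim)          # residual memory parameters
        self.stored_masks: list = []          # activation patterns seen at edit time

    def edit(self, activation: np.ndarray, correction: np.ndarray) -> None:
        mask = sparse_mask(activation)
        self.stored_masks.append(mask)
        # Confining the update to the masked subset limits interference
        # between sequential edits.
        self.weights[mask] += correction[mask]

    def retrieve(self, activation: np.ndarray, min_overlap: int = TOP_K // 2) -> np.ndarray:
        """Activate memory only if the query's pattern matches a stored edit."""
        qmask = sparse_mask(activation)
        best = max((int((qmask & m).sum()) for m in self.stored_masks), default=0)
        if best >= min_overlap:
            return self.weights * qmask       # residual added to the model's output
        return np.zeros_like(self.weights)    # unrelated prompt: memory stays silent
```

A rephrased query with similar activations reuses the stored pattern and receives the correction, while an unrelated prompt with a disjoint pattern leaves the residual path at zero, which is the locality behavior the abstract describes.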