SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing

📅 2026-03-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the susceptibility of large language models to catastrophic forgetting and model collapse during continual knowledge editing. The authors propose a mechanism-aware, precise editing framework that, for the first time, integrates sparse circuits with interpretable neurons. By leveraging a sparse transcoder to construct knowledge circuits, the method identifies and manipulates specific functional neurons, enabling fine-grained knowledge updates with minimal interference to unrelated model capabilities. Designed to support lifelong learning, the approach maintains strong performance on standard benchmarks such as MMLU and GSM8K even after 3,000 consecutive edits on Gemma2, Qwen3, and Llama3.1, significantly outperforming existing techniques.
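The core idea described above — locate the few neurons responsible for a fact and update only them, leaving the rest of the network untouched — can be illustrated with a toy sketch. This is not the paper's algorithm (SCAN builds knowledge circuits with a Sparse Transcoder over a full LLM); here we assume a single linear layer and use the output residual itself as a stand-in attribution score for selecting "anchor" neurons:

```python
import numpy as np

def sparse_anchor_edit(W, x, y_target, k=2):
    """Toy sparse edit: correct only the top-k output neurons with the
    largest error for input x, via a rank-one update to their rows.
    All other rows of W (i.e. unrelated 'knowledge') stay identical."""
    y = W @ x
    err = y_target - y
    # stand-in attribution: pick the k neurons most responsible for the error
    anchors = np.argsort(np.abs(err))[-k:]
    W_new = W.copy()
    # rank-one correction restricted to the anchored rows; for each anchor i,
    # W_new[i] @ x = y[i] + err[i] = y_target[i] exactly
    W_new[anchors] += np.outer(err[anchors], x) / (x @ x)
    return W_new, anchors
```

The contrast with the "dense editing paradigm" criticized in the abstract is that a dense method would update every row of `W`, perturbing outputs the edit never intended to change; restricting the update to the anchored rows is what limits interference across thousands of sequential edits.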

📝 Abstract
Large Language Models (LLMs) often suffer from catastrophic forgetting and collapse during sequential knowledge editing. This vulnerability stems from the prevailing dense editing paradigm, which treats models as black boxes and relies on coarse-grained parameter interventions that inevitably disrupt preserved knowledge. To address this, we propose SCAN, a sparse editing framework based on Sparse Circuit Anchored Neurons, which transforms editing into a mechanism-aware manipulation by constructing a knowledge circuit via Sparse Transcoders. Experiments on Gemma2, Qwen3, and Llama3.1 across CounterFact, ZsRE, and WikiFactDiff demonstrate that SCAN achieves superior performance, maintaining model integrity on benchmarks like MMLU and GSM8K even after 3,000 sequential edits, whereas existing methods deteriorate progressively as edits accumulate, eventually resulting in model collapse.
Problem

Research questions and friction points this paper is trying to address.

catastrophic forgetting
model collapse
sequential knowledge editing
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse Editing
Knowledge Circuit
Catastrophic Forgetting
Mechanism-aware Manipulation
Lifelong Knowledge Editing
Yuhuan Liu
New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences; Cuiying Honors College, Lanzhou University
Haitian Zhong
Institute of Automation, Chinese Academy of Sciences
Large Language Models · Trustworthy AI · AI for Science
Xinyuan Xia
Research Institute of Intelligent Complex Systems, Fudan University
Qiang Liu
Institute of Automation, Chinese Academy of Sciences
Data Mining · Multimodal LLMs · AI for Science
Shu Wu
New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences
Liang Wang
New Laboratory of Pattern Recognition (NLPR), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences