AI Summary
This work addresses the catastrophic forgetting of general capabilities in large language models during sequential knowledge editing, a degradation driven by accumulated parameter updates. The study reveals for the first time that this degradation is closely linked to the collapse of the dominant singular subspace of pre-trained weight matrices. To mitigate this issue, the authors propose REVIVE, a plug-and-play framework that constructs a spectral basis representation via singular value decomposition to explicitly preserve the dominant subspace and filter out harmful perturbations. Experiments demonstrate that REVIVE significantly improves editing efficacy across up to 20,000 sequential edits while effectively maintaining the model's general capabilities, with consistent results across multiple mainstream models and benchmark datasets.
Abstract
Sequential knowledge editing in large language models often causes catastrophic collapse of the model's general abilities, especially for parameter-modifying methods. Existing approaches mitigate this issue through heuristic constraints on parameter updates, yet the mechanisms underlying such degradation remain insufficiently understood. In this work, we present a spectral analysis of sequential knowledge editing and show that a model's general abilities are closely associated with dominant singular directions of pretrained weight matrices. These directions are highly sensitive to perturbations and are progressively disrupted by repeated edits, closely tracking the collapse in both editing efficacy and general performance. Building on this insight, we propose REVIVE, a plug-and-play framework that stabilizes sequential editing by explicitly preserving the dominant singular subspace. REVIVE represents parameter updates in the spectral basis of the original weights and filters components that would interfere with the protected region. Extensive experiments across multiple models and benchmarks show that REVIVE consistently improves editing efficacy while substantially preserving general abilities under long-horizon sequential editing, including extreme settings with up to 20,000 edits.
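The core mechanism described above, expressing an edit in the spectral basis of the pretrained weights and filtering components that touch the dominant singular subspace, can be sketched as follows. This is an illustrative reconstruction, not the paper's actual implementation: the function name `filter_update`, the choice of a hard top-`k` cutoff, and the two-sided projection are all assumptions; REVIVE's real filtering criterion may differ.

```python
import numpy as np

def filter_update(W, dW, k):
    """Sketch of dominant-subspace filtering: remove the components of an
    edit dW that would perturb the top-k singular directions of W.

    W  : pretrained weight matrix (m x n)
    dW : proposed parameter update from a knowledge edit (m x n)
    k  : number of dominant singular directions to protect (assumed hyperparameter)
    """
    # Spectral basis of the original weights
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    Uk = U[:, :k]        # dominant left singular vectors
    Vk = Vt[:k, :].T     # dominant right singular vectors

    # Projectors onto the protected left/right subspaces
    Pu = Uk @ Uk.T
    Pv = Vk @ Vk.T

    # Zero every component of dW whose left or right factor lies in the
    # protected region; the last term corrects for double subtraction.
    return dW - Pu @ dW - dW @ Pv + Pu @ dW @ Pv
```

After filtering, the update has no component along the protected directions, so `Uk.T @ dW_filtered` and `dW_filtered @ Vk` both vanish, leaving the dominant singular subspace of `W + dW_filtered` untouched up to second-order effects.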