🤖 AI Summary
This work addresses the challenges of high computational cost, interference between old and new knowledge, and poor stability in large language model (LLM) editing. To overcome these issues, the authors propose HORSE (Hierarchical Orthogonal Residual SprEad), a mechanism that integrates gradient orthogonality constraints, residual propagation strategies, and Fisher information matrix optimization. This approach suppresses noise-induced gradient interference, thereby improving both the precision and stability of knowledge editing. Extensive experiments across multiple mainstream LLMs and two standard benchmark datasets demonstrate that HORSE enables efficient large-scale knowledge editing while maintaining high accuracy and strong robustness.
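The gradient orthogonality constraint mentioned above can be illustrated with a minimal sketch. The snippet below is not the authors' implementation; it only shows the generic idea of projecting an edit gradient onto the orthogonal complement of directions associated with preserved (old) knowledge, so an update does not move the model along those directions. The function name and the NumPy setting are illustrative assumptions; for an exact complement projection, the old-knowledge directions would first be orthonormalized (e.g., via QR).

```python
import numpy as np

def orthogonalize_gradient(g_new, old_grads, eps=1e-8):
    """Illustrative sketch (not the paper's code): remove from g_new the
    components along each old-knowledge gradient direction, so the edit
    step is orthogonal to directions encoding preserved knowledge."""
    g = np.asarray(g_new, dtype=float).copy()
    for h in old_grads:
        n = np.linalg.norm(h)
        if n < eps:  # skip near-zero directions
            continue
        u = h / n
        g -= np.dot(g, u) * u  # subtract the component along u
    return g

# Toy usage: an edit gradient [1, 1] with protected direction [1, 0]
# is projected to [0, 1], i.e., orthogonal to the protected direction.
g_proj = orthogonalize_gradient([1.0, 1.0], [np.array([1.0, 0.0])])
```

Note that subtracting projections sequentially is exact only when the old directions are mutually orthogonal; with a non-orthogonal set, one pass is an approximation.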
📝 Abstract
Large language models (LLMs) exhibit exceptional performance across various domains, yet they face critical safety concerns. Model editing has emerged as an effective approach to mitigate these issues. Existing model editing methods often focus on optimizing an information matrix that blends new and old knowledge. While effective, these approaches can be computationally expensive and may cause conflicts between edits. In contrast, we shift our attention to Hierarchical Orthogonal Residual SprEad of the information matrix, which reduces noisy gradients and enables more stable edits from a different perspective. We demonstrate the effectiveness of our method, HORSE, through a clear theoretical comparison with several popular methods and extensive experiments conducted on two datasets across multiple LLMs. The results show that HORSE sustains precise editing at scale across diverse scenarios. The code is available at https://github.com/XiaojieGu/HORSE