🤖 AI Summary
Existing knowledge editing methods focus on token-level likelihood optimization, causing updated knowledge to reside in the latent space as isolated residual components. This compromises semantic coherence and disrupts natural reasoning pathways. To address this, we propose a semantic-level knowledge editing framework that introduces the novel concept of "semantic anchors." Our approach integrates latent-space alignment loss, residual flow analysis, and contrastive learning to achieve deep integration of newly injected knowledge with the model's preexisting knowledge structure, without requiring full model retraining. Crucially, edited knowledge is embedded directly into the model's intrinsic reasoning paths, significantly improving reasoning consistency and semantic plausibility. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art baselines across multiple knowledge editing benchmarks, with particularly pronounced gains in complex multi-step reasoning and zero-shot generalization scenarios.
📝 Abstract
Large Language Models store extensive factual knowledge acquired during large-scale pre-training. However, this knowledge is inherently static, reflecting only the state of the world at the time of training. Knowledge editing has emerged as a promising solution for updating outdated or incorrect facts without full retraining. However, most existing locate-and-edit methods primarily focus on token-level likelihood optimization without addressing semantic coherence. Our analysis reveals that such edited knowledge is often encoded as isolated residual streams in the model's latent space, distinct from pre-existing knowledge and bypassing the natural reasoning process. To address this, we propose STEAM, a semantic-level knowledge editing framework that enhances the integration of updated knowledge into the model's knowledge structure. STEAM first identifies target representations as semantic anchors for the updated factual association, then guides the internal representation of the edited fact towards these anchors through an alignment loss during optimization. Experimental results demonstrate that STEAM improves the model's ability to reason with edited knowledge and enhances semantic coherence, underscoring the importance of latent-space alignment for reliable and coherent knowledge editing. The code is available at https://github.com/GY-Jeong/STEAM.
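The abstract describes guiding the edited fact's internal representation toward semantic anchors via an alignment loss. As a rough illustration only (the paper's actual formulation, anchor selection, and loss weighting are not given here, so the function name, cosine-distance form, and weighting coefficient below are all assumptions), such a latent-space alignment term might look like:

```python
import torch
import torch.nn.functional as F

def alignment_loss(edited_repr: torch.Tensor, anchor_reprs: torch.Tensor) -> torch.Tensor:
    """Hypothetical latent-space alignment loss (not the paper's exact formula).

    Pulls the hidden representation of the edited fact toward its semantic
    anchors by minimizing the mean cosine distance to them.

    edited_repr:  (d,)   hidden state of the edited fact
    anchor_reprs: (k, d) representations serving as semantic anchors
    """
    sims = F.cosine_similarity(edited_repr.unsqueeze(0), anchor_reprs, dim=-1)
    return (1.0 - sims).mean()

# Toy usage: the alignment term would be added to the usual editing objective,
# e.g. total_loss = likelihood_loss + lam * alignment_loss(...), with lam a
# hyperparameter (assumed here, not specified in the abstract).
d = 8
edited = torch.randn(d, requires_grad=True)
anchors = torch.randn(3, d)
loss = alignment_loss(edited, anchors)
loss.backward()  # gradients flow back to the edited representation
```

Cosine distance is a natural choice for comparing directions in a model's latent space, but the paper may use a different metric; this sketch only conveys the general idea of anchoring edited representations during optimization.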