In-Context Editing: Learning Knowledge from Self-Induced Distributions

๐Ÿ“… 2024-06-17
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 4
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address overfitting, poor generalization, and unnatural generation in knowledge editing for large language models (LLMs), this paper proposes Consistent In-Context Editing (ICE). ICE reformulates knowledge editing as a contextual output-distribution alignment task, departing from conventional pointwise label correction. It achieves overfitting-resistant knowledge injection via gradient-driven distribution matching against a self-induced contextual distribution, without full model retraining. The method naturally supports continual editing while preserving accuracy, locality, generalization, and linguistic fluency. Evaluated on four standard knowledge editing benchmarks, ICE consistently outperforms state-of-the-art baselines, significantly improving editing robustness and generation naturalness. Experimental results position ICE as a promising approach for efficient, trustworthy, and sustainable LLM knowledge updating.

๐Ÿ“ Abstract
In scenarios where language models must incorporate new information efficiently without extensive retraining, traditional fine-tuning methods are prone to overfitting, degraded generalization, and unnatural language generation. To address these limitations, we introduce Consistent In-Context Editing (ICE), a novel approach leveraging the model's in-context learning capability to optimize toward a contextual distribution rather than a one-hot target. ICE introduces a simple yet effective optimization framework for the model to internalize new knowledge by aligning its output distributions with and without additional context. This method enhances the robustness and effectiveness of gradient-based tuning methods, preventing overfitting and preserving the model's integrity. We analyze ICE across four critical aspects of knowledge editing: accuracy, locality, generalization, and linguistic quality, demonstrating its advantages. Experimental results confirm the effectiveness of ICE and demonstrate its potential for continual editing, ensuring that the integrity of the model is preserved while updating information.
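The core idea in the abstract, optimizing toward the model's own contextual distribution instead of a one-hot target, can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the toy vocabulary, logits, and function names below are hypothetical, and it only shows the loss computation (KL divergence from the context-conditioned distribution to the bare-model distribution), not the gradient updates ICE performs.

```python
import numpy as np

def softmax(logits):
    # Convert logits to a probability distribution (numerically stable).
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q): how far the bare-model distribution q is
    # from the self-induced contextual target p.
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Hypothetical next-token logits over a toy 4-word vocabulary.
logits_with_context = np.array([2.0, 0.5, 0.1, -1.0])   # model prompted with the new fact
logits_without_context = np.array([0.2, 1.5, 0.3, 0.0])  # bare model, no context

p_ctx = softmax(logits_with_context)      # self-induced target distribution
p_base = softmax(logits_without_context)  # distribution to be pulled toward the target

# In ICE-style training this loss would be minimized w.r.t. the model
# parameters producing logits_without_context.
loss = kl_divergence(p_ctx, p_base)
```

Because the target is a full distribution rather than a single correct token, the optimization avoids collapsing probability mass onto one label, which is the abstract's argument for why this prevents overfitting and preserves fluency.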
Problem

Research questions and friction points this paper is trying to address.

Language Model
Overfitting
Text Generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

ICE method
Fast Learning
Overfitting Prevention
Siyuan Qi
Gyges Labs
Machine Learning · Computer Vision
Bangcheng Yang
State Key Laboratory of General Artificial Intelligence, BIGAI
Kailin Jiang
State Key Laboratory of General Artificial Intelligence, BIGAI, University of Science and Technology of China
Xiaobo Wang
University of Science and Technology of China
Natural Language Processing
Jiaqi Li
State Key Laboratory of General Artificial Intelligence, BIGAI
Yifan Zhong
Peking University
VLA Models · Dexterous Manipulation · Reinforcement Learning
Yaodong Yang
State Key Laboratory of General Artificial Intelligence, BIGAI, Peking University
Zilong Zheng
State Key Laboratory of General Artificial Intelligence, BIGAI