CoME: An Unlearning-based Approach to Conflict-free Model Editing

📅 2025-02-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Outdated knowledge固化 during large language model (LLM) pretraining causes knowledge conflicts during model editing, severely undermining update accuracy. To address this, we propose the first selective-forgetting-based, conflict-free model editing framework—introducing model unlearning mechanisms systematically into the editing pipeline for the first time—to decouple knowledge updating from language capability retention. Our method integrates gradient-controlled parameter-level forgetting, counterfactual supervision signals, and structured evaluation on Counterfact and ZsRE benchmarks. Experiments on GPT-J and LLaMA-3 demonstrate a 12.7% improvement in editing accuracy, significantly enhanced generalization consistency across unseen facts, and no degradation in generation quality. The core contribution lies in mitigating knowledge conflicts at their root, establishing a novel paradigm for reliable, interpretable, and controllable knowledge updates in LLMs.

Technology Category

Application Category

📝 Abstract
Large language models (LLMs) often retain outdated or incorrect information from pre-training, which undermines their reliability. While model editing methods have been developed to address such errors without full re-training, they frequently suffer from knowledge conflicts, where outdated information interferes with new knowledge. In this work, we propose Conflict-free Model Editing (CoME), a novel framework that enhances the accuracy of knowledge updates in LLMs by selectively removing outdated knowledge. CoME leverages unlearning to mitigate knowledge interference, allowing new information to be integrated without compromising relevant linguistic features. Through experiments on GPT-J and LLaMA-3 using Counterfact and ZsRE datasets, we demonstrate that CoME improves both editing accuracy and model reliability when applied to existing editing methods. Our results highlight that the targeted removal of outdated knowledge is crucial for enhancing model editing effectiveness and maintaining the model's generative performance.
Problem

Research questions and friction points this paper is trying to address.

Addresses outdated information in LLMs
Reduces knowledge conflicts during model edits
Enhances accuracy and reliability of model updates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unlearning-based model editing
Conflict-free knowledge integration
Selective outdated knowledge removal
🔎 Similar Papers
No similar papers found.