Knowledge Editing for Multi-Hop Question Answering Using Semantic Analysis

📅 2025-07-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multi-hop question answering (MQA) suffers from reasoning-chain inconsistency when knowledge editing is applied. Method: this paper proposes CHECK, a semantic-driven knowledge editing framework built on a novel "compiler analogy": reasoning chains are treated as parseable and optimizable intermediate representations. CHECK identifies logical flaws via semantic analysis and rectifies erroneous reasoning paths through logic-constrained optimization and high-temperature resampling, all without model fine-tuning. Contribution/Results: CHECK remains lightweight to deploy while significantly improving reasoning consistency and editability under knowledge updates. Evaluated on four standard MQA benchmarks, it achieves an average accuracy gain of 22.8% over five state-of-the-art methods, demonstrating superior robustness and effectiveness in knowledge editing scenarios.

📝 Abstract
Large Language Models (LLMs) require lightweight avenues for updating stored information that has fallen out of date. Knowledge Editing (KE) approaches have been successful at updating model knowledge for simple factual queries but struggle with tasks that require compositional reasoning, such as multi-hop question answering (MQA). We observe that existing knowledge editors leverage decompositional techniques that result in illogical reasoning processes. In this paper, we propose CHECK, a knowledge editor for MQA based on semantic analysis. Our framework is built on insights from an analogy between compilers and reasoning with LLMs: just as source code is compiled before being executed, we semantically analyze reasoning chains before executing them to answer questions. Reasoning chains with semantic errors are revised to ensure consistency through logic optimization and re-prompting the LLM at a higher temperature. We evaluate the effectiveness of CHECK against five state-of-the-art frameworks on four datasets and achieve an average 22.8% improvement in MQA accuracy.
Problem

Research questions and friction points this paper is trying to address.

Updating outdated knowledge in Large Language Models efficiently
Improving multi-hop question answering via semantic analysis
Correcting illogical reasoning chains in knowledge editing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic analysis for multi-hop QA
Compiler-inspired reasoning chain optimization
Temperature-based re-prompting for consistency
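
The analyze-then-revise loop summarized above can be sketched as a simple control loop. This is an illustrative sketch, not the paper's implementation: `generate_chain` stands in for an LLM call, `chain_is_consistent` for the paper's logic-constrained semantic analysis, and the retry policy and temperature schedule are assumptions.

```python
from typing import Callable, List

def edit_and_verify_chain(
    question: str,
    generate_chain: Callable[[str, float], List[str]],
    chain_is_consistent: Callable[[List[str]], bool],
    base_temperature: float = 0.2,
    temperature_step: float = 0.4,
    max_attempts: int = 3,
) -> List[str]:
    """Hypothetical sketch of a CHECK-style loop: generate a reasoning
    chain, semantically analyze it, and if a logical flaw is found,
    re-prompt at a higher temperature to sample an alternative chain
    (no model fine-tuning involved)."""
    temperature = base_temperature
    chain: List[str] = []
    for _ in range(max_attempts):
        chain = generate_chain(question, temperature)
        if chain_is_consistent(chain):
            return chain  # chain passed the semantic check
        # raise sampling temperature to escape the flawed reasoning path
        temperature = min(temperature + temperature_step, 1.2)
    return chain  # best effort after exhausting retries
```

The key design point mirrored here is that verification happens before the chain is executed to answer the question, analogous to compiling before running, so flawed chains are resampled rather than the model being updated.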