Localizing Factual Inconsistencies in Attributable Text Generation

📅 2024-10-09
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
Precisely localizing factual inconsistencies in attributable text generation remains challenging. Method: QASemConsistency, a framework grounded in Neo-Davidsonian semantics, decomposes generated text into predicate-argument-level question-answer (QA) pairs, yielding an interpretable, fine-grained formalization of consistency. Supervised entailment models and open-source large language models then compare each QA pair against a trusted source text, so that each unsupported pair pinpoints a single unsupported semantic relation. Contribution/Results: The same formalism supports both high-agreement human annotation (Cohen's κ > 0.7) and automated detection, and experiments validate several automated strategies for localized factual-inconsistency detection in attributable generation.

📝 Abstract
There has been an increasing interest in detecting hallucinations in model-generated texts, both manually and automatically, at varying levels of granularity. However, most existing methods fail to precisely pinpoint the errors. In this work, we introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation, at a fine-grained level. Drawing inspiration from Neo-Davidsonian formal semantics, we propose decomposing the generated text into minimal predicate-argument level propositions, expressed as simple question-answer (QA) pairs, and assess whether each individual QA pair is supported by a trusted reference text. As each QA pair corresponds to a single semantic relation between a predicate and an argument, QASemConsistency effectively localizes the unsupported information. We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation, by collecting crowdsourced annotations of granular consistency errors, while achieving a substantial inter-annotator agreement ($\kappa > 0.7$). Then, we implement several methods for automatically detecting localized factual inconsistencies, with both supervised entailment models and open-source LLMs.
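The core loop of the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `QAPair` structure and the string-matching `toy_entailment` stand in for the paper's QA-SRL-style decomposition and its supervised entailment models or LLM judges, which are not reproduced here.

```python
# Hypothetical sketch of QASemConsistency-style error localization.
# Each generated sentence is decomposed into predicate-argument QA pairs;
# each pair is checked against a trusted source, and unsupported pairs
# localize the factual inconsistency to a single semantic relation.

from dataclasses import dataclass

@dataclass
class QAPair:
    question: str  # targets one predicate-argument relation
    answer: str    # the argument as stated in the generated text

def toy_entailment(source: str, qa: QAPair) -> bool:
    # Stand-in for a supervised entailment model or an LLM judge:
    # here we merely require the answer string to appear in the source.
    return qa.answer.lower() in source.lower()

def localize_inconsistencies(source: str, qa_pairs: list) -> list:
    # Return the QA pairs not supported by the source text.
    return [qa for qa in qa_pairs if not toy_entailment(source, qa)]

source = "Marie Curie won the Nobel Prize in Physics in 1903."
qa_pairs = [
    QAPair("Who won a Nobel Prize?", "Marie Curie"),
    QAPair("When did she win it?", "1903"),
    QAPair("In what field did she win it?", "Chemistry"),  # unsupported
]

for qa in localize_inconsistencies(source, qa_pairs):
    print("Unsupported:", qa.question, "->", qa.answer)
```

Because each QA pair encodes exactly one predicate-argument relation, the flagged pair above localizes the error to the "field" argument rather than rejecting the whole sentence.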
Problem

Research questions and friction points this paper is trying to address.

Localizing factual inconsistencies in attributable text generation
Pinpointing errors at fine-grained semantic level
Assessing predicate-argument propositions against reference texts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposing text into predicate-argument propositions
Expressing propositions as simple question-answer pairs
Assessing each QA pair against reference texts