AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge

📅 2024-09-11

🏛️ arXiv.org

📈 Citations: 5

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) can lose accuracy at inference time when the information in their context conflicts with their parametric knowledge. Existing test-time contrastive methods apply a static adjustment regardless of how much conflict is actually present, so they over-correct when there is none. This paper proposes AdaCAD, a fine-grained, instance-level decoding method that measures the degree of conflict with the Jensen-Shannon divergence between the distributions representing contextual and parametric knowledge, and infers the adjustment weight from it. The method operates at test time, during decoding. Across four LLMs, six question-answering (QA) datasets, and three summarization datasets, AdaCAD consistently outperforms other decoding baselines: average QA accuracy rises by 14.21 percentage points (absolute) over a static contrastive baseline, and summary factuality improves by 6.19 AlignScore. Where contrastive baselines hurt performance on conflict-free samples, AdaCAD mitigates those losses.

📝 Abstract

Knowledge conflict arises from discrepancies between information in the context of a large language model (LLM) and the knowledge stored in its parameters. This can hurt performance when using standard decoding techniques, which tend to ignore the context. Existing test-time contrastive methods seek to address this by comparing the LLM's output distribution with and without the context and adjusting the model according to the contrast between them. However, we find that these methods frequently misjudge the degree of conflict and struggle to handle instances that vary in their amount of conflict, with static methods over-adjusting when conflict is absent. We propose a fine-grained, instance-level approach called AdaCAD, which dynamically infers the weight of adjustment based on the degree of conflict, as measured by the Jensen-Shannon divergence between distributions representing contextual and parametric knowledge. Across four LLMs, six question-answering (QA) datasets, and three summarization datasets, we demonstrate that AdaCAD consistently outperforms other decoding baselines, with average QA accuracy gains of 14.21% (absolute) over a static contrastive baseline, and improves the factuality of summaries by 6.19 (AlignScore). Lastly, we show that while contrastive baselines hurt performance when conflict is absent, AdaCAD mitigates these losses, making it more applicable to real-world datasets in which some examples have conflict and others do not.
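The adaptive weighting described in the abstract can be illustrated with a minimal sketch of a single decoding step. The abstract does not give the exact update rule, so the code assumes two things that should be read as assumptions rather than the authors' implementation: the adjustment weight is set equal to the base-2 Jensen-Shannon divergence (which lies in [0, 1]), and the adjustment takes the common contrastive form of amplifying the with-context logits and subtracting the without-context logits. The function names and toy logits are hypothetical.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_bits(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Base-2 Jensen-Shannon divergence, bounded in [0, 1]."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_bits(p, mix) + 0.5 * kl_bits(q, mix)

def adaptive_contrastive_step(logits_with_ctx, logits_without_ctx):
    """One next-token step with a conflict-dependent weight.

    Assumption: alpha = JSD, and the adjusted logits are
    (1 + alpha) * with_ctx - alpha * without_ctx.
    """
    p_ctx = softmax(logits_with_ctx)      # contextual knowledge
    p_par = softmax(logits_without_ctx)   # parametric knowledge
    alpha = js_divergence(p_ctx, p_par)   # degree of conflict
    adjusted = [
        (1 + alpha) * c - alpha * n
        for c, n in zip(logits_with_ctx, logits_without_ctx)
    ]
    return alpha, softmax(adjusted)

# Toy three-token vocabulary (hypothetical values).
no_conflict = adaptive_contrastive_step([2.0, 0.5, 0.1], [2.0, 0.5, 0.1])
conflict = adaptive_contrastive_step([10.0, 0.0, 0.0], [0.0, 10.0, 0.0])
```

When the two distributions agree, the weight is zero and the step reduces to ordinary decoding from the with-context logits, which is the behavior that avoids over-adjusting on conflict-free examples. When they disagree strongly, the weight approaches one and the context-supported token is boosted.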

Problem

Research questions and friction points this paper is trying to address.

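Knowledge conflict between an LLM's context and its parametric knowledge hurts performance under standard decoding, which tends to ignore the context

Static contrastive decoding methods frequently misjudge the degree of conflict and over-adjust when conflict is absent

Real-world datasets mix examples with and without conflict, so a decoding method must handle both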

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamically adjusts decoding based on conflict degree

Uses Jensen-Shannon divergence to measure knowledge conflict

Improves average QA accuracy by 14.21 points (absolute) over a static contrastive baseline and summary factuality by 6.19 AlignScore
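For reference, the standard definition of the Jensen-Shannon divergence used as the conflict measure is given below. The symbols are assumptions introduced here for illustration: P stands for the next-token distribution with the context and Q for the distribution without it.

```latex
\[
\mathrm{JSD}(P \,\|\, Q)
  = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M)
  + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
\qquad
M = \tfrac{1}{2}\,(P + Q),
\]
\[
\mathrm{KL}(P \,\|\, M) = \sum_{x} P(x)\,\log_2 \frac{P(x)}{M(x)},
\qquad
0 \le \mathrm{JSD}(P \,\|\, Q) \le 1 .
\]
```

With base-2 logarithms the divergence is symmetric and bounded between 0 (identical distributions) and 1 (disjoint support), which makes it a natural candidate for a per-instance weight.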
