When to Trust Context: Self-Reflective Debates for Context Reliability

📅 2025-06-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) frequently produce factual errors and hallucinations when their parametric knowledge conflicts with the input context, motivating a reliable mechanism for assessing context trustworthiness. To address this, we propose SR-DCR, a lightweight self-reflective debate framework that combines token-level confidence modeling with an asymmetric multi-agent debate. SR-DCR uses a role-separated architecture: a context-free critic, a context-dependent advocate, and an independent judge that never accesses the original input. Context reliability is determined by jointly scoring the judge's verdict and the model's confidence. Evaluated on the ClashEval benchmark, SR-DCR achieves a +12.7% improvement in robustness against misleading contexts while maintaining high accuracy (>94%) on trustworthy contexts, outperforming both classical debate and pure confidence-based baselines at minimal computational overhead.

📝 Abstract
Large language models frequently encounter conflicts between their parametric knowledge and contextual input, often resulting in factual inconsistencies or hallucinations. We propose Self-Reflective Debate for Contextual Reliability (SR-DCR), a lightweight framework that integrates token-level self-confidence with an asymmetric multi-agent debate to adjudicate such conflicts. A critic, deprived of context, challenges a defender who argues from the given passage; a judge model evaluates the debate and determines the context's reliability. The final answer is selected by combining the verdict with model confidence. Experiments on the ClashEval benchmark demonstrate that SR-DCR consistently enhances robustness to misleading context while maintaining accuracy on trustworthy inputs, outperforming both classical debate and confidence-only baselines with minimal computational overhead. The code is available at https://github.com/smiles724/Self-Reflective-Debates.
Problem

Research questions and friction points this paper is trying to address.

Resolving conflicts between model knowledge and context
Determining reliability of contextual input via debate
Improving robustness to misleading context efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-level self-confidence integration
Asymmetric multi-agent debate framework
Judge model evaluates context reliability
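
The three components above can be sketched as a simple decision rule: a confidence score is derived from the token log-probabilities of the model's context-free answer, then combined with the judge's verdict to pick the final answer. This is a minimal illustrative sketch, assuming the function names, the averaging-based confidence proxy, the threshold value, and the low-confidence fallback; the paper's exact scoring procedure may differ.

```python
import math

def self_confidence(token_logprobs):
    """One simple proxy for token-level self-confidence:
    the geometric-mean probability of the answer tokens."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def sr_dcr_answer(parametric_answer, contextual_answer,
                  judge_says_reliable, confidence, threshold=0.8):
    """Combine the judge's verdict with model self-confidence.

    If the judge deems the context reliable, trust the
    context-grounded (advocate's) answer; otherwise fall back
    to the parametric answer only when self-confidence is high.
    The final low-confidence tie-break is an assumption here.
    """
    if judge_says_reliable:
        return contextual_answer
    if confidence >= threshold:
        return parametric_answer
    return contextual_answer  # low confidence: defer to context anyway

# Toy usage: judge flags the context as misleading, and the model is
# confident in its parametric answer, so the parametric answer wins.
conf = self_confidence([-0.05, -0.10, -0.02])
ans = sr_dcr_answer("Paris", "Lyon", judge_says_reliable=False, confidence=conf)
```

The debate itself (critic vs. advocate, then judge) would run as separate LLM calls whose transcript the judge scores; only the final combination step is shown here.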