When to Trust Context: Self-Reflective Debates for Context Reliability

📅 2025-06-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) frequently produce factual errors and hallucinations when their parametric knowledge conflicts with the input context, motivating a reliable mechanism for assessing context trustworthiness. To address this, we propose SR-DCR, a lightweight self-reflective debate framework that combines token-level confidence modeling with an asymmetric multi-agent debate. SR-DCR uses a role-separated architecture: a context-free critic, a context-dependent advocate, and an independent judge that never accesses the original input. Context reliability is determined by jointly scoring the judge's verdict and the model's confidence. Evaluated on the ClashEval benchmark, SR-DCR achieves a +12.7% improvement in robustness against misleading contexts while maintaining high accuracy (>94%) on trustworthy contexts, outperforming both classical debate and pure confidence-based baselines at minimal computational overhead.

📝 Abstract
Large language models frequently encounter conflicts between their parametric knowledge and contextual input, often resulting in factual inconsistencies or hallucinations. We propose Self-Reflective Debate for Contextual Reliability (SR-DCR), a lightweight framework that integrates token-level self-confidence with an asymmetric multi-agent debate to adjudicate such conflicts. A critic, deprived of context, challenges a defender who argues from the given passage; a judge model evaluates the debate and determines the context's reliability. The final answer is selected by combining the verdict with model confidence. Experiments on the ClashEval benchmark demonstrate that SR-DCR consistently enhances robustness to misleading context while maintaining accuracy on trustworthy inputs, outperforming both classical debate and confidence-only baselines with minimal computational overhead. The code is available at https://github.com/smiles724/Self-Reflective-Debates.
Problem

Research questions and friction points this paper is trying to address.

Resolving conflicts between model knowledge and context
Determining reliability of contextual input via debate
Improving robustness to misleading context efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-level self-confidence integration
Asymmetric multi-agent debate framework
Judge model evaluates context reliability
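
The three components above can be sketched as a simple decision rule: a confidence score is derived from the token log-probabilities of the model's context-free answer, then combined with the judge's verdict to pick the final answer. This is a minimal illustrative sketch, assuming the function names, the averaging-based confidence proxy, the threshold value, and the low-confidence fallback; the paper's exact scoring procedure may differ.

```python
import math

def self_confidence(token_logprobs):
    """One simple proxy for token-level self-confidence:
    the geometric-mean probability of the answer tokens."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))

def sr_dcr_answer(parametric_answer, contextual_answer,
                  judge_says_reliable, confidence, threshold=0.8):
    """Combine the judge's verdict with model self-confidence.

    If the judge deems the context reliable, trust the
    context-grounded (advocate's) answer; otherwise fall back
    to the parametric answer only when self-confidence is high.
    The final low-confidence tie-break is an assumption here.
    """
    if judge_says_reliable:
        return contextual_answer
    if confidence >= threshold:
        return parametric_answer
    return contextual_answer  # low confidence: defer to context anyway

# Toy usage: judge flags the context as misleading, and the model is
# confident in its parametric answer, so the parametric answer wins.
conf = self_confidence([-0.05, -0.10, -0.02])
ans = sr_dcr_answer("Paris", "Lyon", judge_says_reliable=False, confidence=conf)
```

The debate itself (critic vs. advocate, then judge) would run as separate LLM calls whose transcript the judge scores; only the final combination step is shown here.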