DiverseAgentEntropy: Quantifying Black-Box LLM Uncertainty through Diverse Perspectives and Multi-Agent Interaction

📅 2024-12-12
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
Quantifying factual uncertainty in black-box large language models (LLMs) remains challenging: models are susceptible to contextual bias and can give inconsistent responses across paraphrased queries. Method: The paper proposes a multi-agent collaborative uncertainty quantification paradigm: diverse perspectives on the same query are generated via paraphrased questioning, response consistency is measured as entropy over the agents' outputs, and high entropy triggers an interpretable, active abstention mechanism. Contribution/Results: This work integrates multi-agent interaction and perspective diversity into LLM uncertainty modeling, uncovering a "knowing-but-unstable" phenomenon in which models possess the relevant factual knowledge yet retrieve it inconsistently across viewpoints. Experiments show significant improvements over self-consistency baselines in reliability prediction and hallucination detection, and empirical analysis reveals pervasive cross-perspective inconsistency in factual retrieval among mainstream LLMs.
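The core idea can be sketched in a few lines: collect final answers from agents that each see a differently paraphrased version of the query, compute the entropy of the answer distribution, and abstain when entropy exceeds a threshold. This is a minimal illustrative sketch, not the authors' implementation; the function names and the uniform (unweighted) answer aggregation are assumptions, and the paper's method additionally involves agent interaction, which is omitted here.

```python
from collections import Counter
import math

def diverse_agent_entropy(answers):
    """Shannon entropy (bits) over the distribution of final answers
    returned by agents asked paraphrased versions of the same query.
    0.0 means perfect agreement; higher values mean more disagreement."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def answer_or_abstain(answers, threshold=1.0):
    """Return the majority answer, or None (abstain) when the
    entropy across agents exceeds the threshold. The threshold
    value here is arbitrary and would be tuned in practice."""
    if diverse_agent_entropy(answers) > threshold:
        return None  # high uncertainty: withhold the response
    return Counter(answers).most_common(1)[0][0]
```

For example, three agents answering "Paris" and one answering "Lyon" give an entropy of about 0.81 bits, below the threshold, so the majority answer is returned; four mutually distinct answers give 2.0 bits and trigger abstention.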

📝 Abstract
Quantifying the uncertainty in the factual parametric knowledge of Large Language Models (LLMs), especially in a black-box setting, poses a significant challenge. Existing methods, which gauge a model's uncertainty through evaluating self-consistency in responses to the original query, do not always capture true uncertainty. Models might respond consistently to the original query with a wrong answer, yet respond correctly to varied questions from different perspectives about the same query, and vice versa. In this paper, we propose a novel method, DiverseAgentEntropy, for evaluating a model's uncertainty using multi-agent interaction under the assumption that if a model is certain, it should consistently recall the answer to the original query across a diverse collection of questions about the same original query. We further implement an abstention policy to withhold responses when uncertainty is high. Our method offers a more accurate prediction of the model's reliability and further detects hallucinations, outperforming other self-consistency-based methods. Additionally, it demonstrates that existing models often fail to consistently retrieve the correct answer to the same query under diverse question variations even when knowing the correct answer.
Problem

Research questions and friction points this paper is trying to address.

Quantifying uncertainty in black-box LLMs for reliable responses
Addressing misleading self-consistency methods in uncertainty estimation
Improving hallucination detection through multi-agent query variations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent interaction estimates black-box model uncertainty
Diverse query variations improve uncertainty assessment accuracy
Novel method outperforms self-consistency-based techniques