Efficient Hallucination Detection: Adaptive Bayesian Estimation of Semantic Entropy with Guided Semantic Exploration

📅 2026-03-24
🤖 AI Summary
This work addresses hallucination-induced factual errors in large language models. Existing detection methods rely on fixed sampling strategies and therefore struggle to balance efficiency and accuracy. To overcome this limitation, the authors propose an adaptive Bayesian semantic entropy estimation framework that integrates hierarchical Bayesian modeling with guided semantic exploration. The framework combines a variance-threshold-based dynamic stopping criterion with a perturbation-driven importance sampling strategy, so that the amount of sampling adapts to the observed semantic uncertainty. Evaluated on four question-answering benchmarks, the method achieves an average 12.6% improvement in AUROC over existing approaches under identical sampling budgets. In low-budget settings, it attains comparable detection performance with roughly half the samples required by existing methods.

📝 Abstract
Large language models (LLMs) have achieved remarkable success in various natural language processing tasks, yet they remain prone to generating factually incorrect outputs known as hallucinations. While recent approaches have shown promise for hallucination detection by repeatedly sampling from LLMs and quantifying the semantic inconsistency among the generated responses, they rely on fixed sampling budgets that fail to adapt to query complexity, resulting in computational inefficiency. We propose an Adaptive Bayesian Estimation framework for Semantic Entropy with Guided Semantic Exploration, which dynamically adjusts sampling requirements based on observed uncertainty. Our approach employs a hierarchical Bayesian framework to model the semantic distribution, enabling dynamic control of sampling iterations through variance-based thresholds that terminate generation once sufficient certainty is achieved. We also develop a perturbation-based importance sampling strategy to systematically explore the semantic space. Extensive experiments on four QA datasets demonstrate that our method achieves superior hallucination detection performance with significant efficiency gains. In low-budget scenarios, our approach requires about 50% fewer samples to achieve detection performance comparable to existing methods, while delivering an average AUROC improvement of 12.6% under the same sampling budget.
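The variance-based stopping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' hierarchical model and it omits the guided exploration step entirely. It assumes a flat Dirichlet posterior over a fixed, known number of semantic clusters and estimates the posterior mean and variance of the entropy by Monte Carlo. The names `sample_cluster`, `var_threshold`, `prior`, `min_samples`, and `max_samples` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_posterior(counts, prior=0.5, n_draws=2000, seed=0):
    """Monte Carlo mean and variance of the entropy under a Dirichlet posterior.

    Assumption: a symmetric Dirichlet prior over a fixed set of clusters,
    a simplification of the hierarchical model the abstract mentions.
    """
    rng = random.Random(seed)
    alphas = [c + prior for c in counts]
    values = [entropy(sample_dirichlet(alphas, rng)) for _ in range(n_draws)]
    mean = sum(values) / n_draws
    var = sum((v - mean) ** 2 for v in values) / (n_draws - 1)
    return mean, var


def adaptive_semantic_entropy(sample_cluster, n_clusters, var_threshold=0.01,
                              min_samples=2, max_samples=20):
    """Keep sampling until the posterior variance of the entropy is small.

    `sample_cluster` is a hypothetical callable standing in for
    "generate one LLM response and return the index of its semantic cluster".
    """
    counts = [0] * n_clusters
    for n in range(1, max_samples + 1):
        counts[sample_cluster()] += 1
        mean, var = entropy_posterior(counts)
        if n >= min_samples and var < var_threshold:
            break
    return mean, var, n
```

In this sketch the stopping rule reflects confidence in the entropy estimate rather than low entropy itself, so sampling can end early both when responses agree and when they are clearly spread across clusters. A call such as `adaptive_semantic_entropy(lambda: 0, n_clusters=3)` simulates a model that always returns the same meaning.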
Problem

Research questions and friction points this paper is trying to address.

hallucination detection
large language models
semantic entropy
sampling efficiency
computational inefficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Bayesian Estimation
Semantic Entropy
Hallucination Detection
Guided Semantic Exploration
Importance Sampling
Qiyao Sun
Queen Mary University of London
AI Scientist
Xingming Li
National University of Defense Technology, Changsha, China
Xixiang He
National University of Defense Technology, Changsha, China
Ao Cheng
National University of Defense Technology, Changsha, China
Xuanyu Ji
National University of Defense Technology, Changsha, China
Hailun Lu
Intelligent Game and Decision Lab, Beijing, China
Runke Huang
The Chinese University of Hong Kong, Shenzhen, China
Qingyong Hu
Ph.D. of Computer Science, University of Oxford
3D Vision, Photogrammetry, Point Cloud Processing, Autonomous Driving