Detecting AI Hallucinations in Finance: An Information-Theoretic Method Cuts Hallucination Rate by 92%

πŸ“… 2025-12-02
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address the safety risk of hallucinated, unsupported responses from large language models (LLMs) in finance, this paper proposes ECLIPSEβ€”a novel framework that formally characterizes hallucination as a mismatch between semantic entropy and evidence capacity, proving its objective function is strictly convex with a unique stable optimum. Methodologically, ECLIPSE innovatively estimates semantic entropy via multi-sample clustering and quantifies model reliance on retrieved evidence through token-level perplexity decomposition. A key empirical finding is that token-level log-probability uncertainty serves as a decisive signal for hallucination detection. Evaluated on a financial question-answering benchmark, ECLIPSE achieves 0.89 ROC AUC and 0.90 mean precision, significantly outperforming all baselines. Ablation studies confirm that performance gains stem directly from calibrated token-level probability modeling, rather than architectural or retrieval enhancements.

Technology Category

Application Category

πŸ“ Abstract
Large language models (LLMs) produce fluent but unsupported answers - hallucinations - limiting safe deployment in high-stakes domains. We propose ECLIPSE, a framework that treats hallucination as a mismatch between a model's semantic entropy and the capacity of available evidence. We combine entropy estimation via multi-sample clustering with a novel perplexity decomposition that measures how models use retrieved evidence. We prove that under mild conditions, the resulting entropy-capacity objective is strictly convex with a unique stable optimum. We evaluate on a controlled financial question answering dataset with GPT-3.5-turbo (n=200 balanced samples with synthetic hallucinations), where ECLIPSE achieves ROC AUC of 0.89 and average precision of 0.90, substantially outperforming a semantic entropy-only baseline (AUC 0.50). A controlled ablation with Claude-3-Haiku, which lacks token-level log probabilities, shows AUC dropping to 0.59 with coefficient magnitudes decreasing by 95% - demonstrating that ECLIPSE is a logprob-native mechanism whose effectiveness depends on calibrated token-level uncertainties. The perplexity decomposition features exhibit the largest learned coefficients, confirming that evidence utilization is central to hallucination detection. We position this work as a controlled mechanism study; broader validation across domains and naturally occurring hallucinations remains future work.
Problem

Research questions and friction points this paper is trying to address.

Detect AI hallucinations in financial domain using information theory
Reduce hallucination rate by 92% through entropy-capacity mismatch analysis
Measure how language models utilize retrieved evidence against semantic entropy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses semantic entropy and evidence capacity mismatch detection
Combines multi-sample clustering with perplexity decomposition
Relies on calibrated token-level uncertainties for effectiveness
πŸ”Ž Similar Papers
No similar papers found.