What do Geometric Hallucination Detection Metrics Actually Measure?

📅 2026-02-09
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
Existing geometric hallucination detection metrics struggle to distinguish specific hallucination types in the absence of ground truth and are highly sensitive to domain shifts. This work addresses these limitations by constructing a synthetic dataset to systematically evaluate the capacity of various geometric statistics to capture key hallucination attributes—such as output correctness, relevance, and coherence—and reveals that different metrics align with distinct hallucination types. Furthermore, the study proposes a simple yet effective normalization strategy that substantially mitigates the impact of domain shift. Experimental results demonstrate that, under multi-domain settings, the proposed approach improves AUROC by 34 percentage points, significantly enhancing the cross-domain robustness of geometric hallucination detection metrics.

📝 Abstract
Hallucination remains a barrier to deploying generative models in high-consequence applications. This is especially true in cases where external ground truth is not readily available to validate model outputs. This situation has motivated the study of geometric signals in the internal state of an LLM that are predictive of hallucination and require limited external knowledge. Given that there are a range of factors that can lead model output to be called a hallucination (e.g., irrelevance vs incoherence), in this paper we ask what specific properties of a hallucination these geometric statistics actually capture. To assess this, we generate a synthetic dataset which varies distinct properties of output associated with hallucination. This includes output correctness, confidence, relevance, coherence, and completeness. We find that different geometric statistics capture different types of hallucinations. Along the way we show that many existing geometric detection methods have substantial sensitivity to shifts in task domain (e.g., math questions vs. history questions). Motivated by this, we introduce a simple normalization method to mitigate the effect of domain shift on geometric statistics, leading to AUROC gains of +34 points in multi-domain settings.
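The abstract does not spell out the "simple normalization method" on this page, but one plausible reading is standardizing the geometric statistic within each task domain before scoring. A minimal sketch of that idea, assuming a per-domain z-score and a rank-based AUROC (the domain names, variable names, and normalization details here are illustrative assumptions, not the paper's implementation):

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalize_per_domain(scores, domains):
    """Z-score each geometric statistic within its own task domain,
    so domain-level shifts in mean/scale do not swamp the signal."""
    by_dom = defaultdict(list)
    for s, d in zip(scores, domains):
        by_dom[d].append(s)
    stats = {d: (mean(v), pstdev(v) or 1.0) for d, v in by_dom.items()}
    return [(s - stats[d][0]) / stats[d][1] for s, d in zip(scores, domains)]

def auroc(scores, labels):
    """AUROC as the probability that a random positive outranks a
    random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Two toy domains whose raw score ranges barely overlap: pooling them
# degrades AUROC even though each domain is separable on its own.
scores  = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4]
domains = ["math"] * 4 + ["history"] * 4
labels  = [0, 0, 1, 1, 0, 0, 1, 1]   # 1 = hallucination

print(auroc(scores, labels))                                 # 0.75 raw
print(auroc(normalize_per_domain(scores, domains), labels))  # 1.0 normalized
```

The toy numbers illustrate the failure mode the abstract describes: each domain's statistic separates hallucinations perfectly, but the pooled, unnormalized scores do not, and per-domain standardization recovers the lost separation.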
Problem

Research questions and friction points this paper is trying to address.

hallucination
geometric metrics
large language models
domain shift
output properties
Innovation

Methods, ideas, or system contributions that make the work stand out.

geometric hallucination detection
domain shift
synthetic dataset
normalization method
LLM internal representations
Eric Yeats
Pacific Northwest National Laboratory
John Buckheit
Pacific Northwest National Laboratory
Sarah Scullen
Pacific Northwest National Laboratory
Brendan Kennedy
Professor of Chemistry, The University of Sydney
Crystallography, Inorganic Chemistry, Structural Phase Transitions
Loc Truong
Pacific Northwest National Laboratory
Davis Brown
University of Pennsylvania
deep learning
Bill Kay
Mathematician
Combinatorics, Graph Theory, Information Theory
Cliff Joslyn
Pacific Northwest National Laboratory
Tegan Emerson
Senior Data Scientist
Michael J. Henry
Pacific Northwest National Laboratory
John Emanuello
Laboratory for Advanced Cybersecurity Research, National Security Agency
Henry Kvinge
Pacific Northwest National Lab/University of Washington
representation learning, adversarial machine learning, geometric deep learning, representation theory, combinatorics