LLM Hallucination Detection: A Fast Fourier Transform Method Based on Hidden Layer Temporal Signals

📅 2025-09-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) pose decision risks in reliability-critical applications because of hallucination, yet existing detection methods are limited either by external knowledge coverage or by reliance on static hidden-state representations, which results in poor robustness and generalization. Method: We propose the first hallucination detection framework based on dynamic modeling of temporal signals across transformer hidden layers. Our approach applies frequency-domain analysis, using the Fast Fourier Transform (FFT) to extract the dominant non-DC frequency component from hidden activation sequences, and exploits autoregressive generation to identify optimal observation points, thereby establishing an inference-process-driven detection paradigm. It operates solely on internal model signals and requires no external knowledge or fine-tuning. Contribution/Results: Evaluated on benchmarks including TruthfulQA, our method outperforms state-of-the-art approaches by over 10 percentage points, demonstrating substantial improvements in detection accuracy, robustness against distribution shifts, and practical deployability.

📝 Abstract

Hallucination remains a critical barrier to deploying large language models (LLMs) in reliability-sensitive applications. Existing detection methods largely fall into two categories: factuality checking, which is fundamentally constrained by external knowledge coverage, and static hidden-state analysis, which fails to capture deviations in reasoning dynamics. As a result, their effectiveness and robustness remain limited. We propose HSAD (Hidden Signal Analysis-based Detection), a novel hallucination detection framework that models the temporal dynamics of hidden representations during autoregressive generation. HSAD constructs hidden-layer signals by sampling activations across layers, applies the Fast Fourier Transform (FFT) to obtain frequency-domain representations, and extracts the strongest non-DC frequency component as the spectral feature. Furthermore, by leveraging the autoregressive nature of LLMs, HSAD identifies optimal observation points for effective and reliable detection. Across multiple benchmarks, including TruthfulQA, HSAD achieves an improvement of over 10 percentage points compared to prior state-of-the-art methods. By integrating reasoning-process modeling with frequency-domain analysis, HSAD establishes a new paradigm for robust hallucination detection in LLMs.
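
The spectral step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the hidden-layer signal is sampled, so the code assumes each signal is a 1-D sequence holding one activation value per layer, and the function names `strongest_non_dc_component` and `spectral_features` are hypothetical. Only NumPy's `np.fft.rfft` is used.

```python
import numpy as np


def strongest_non_dc_component(signal):
    """Return (bin_index, magnitude) of the largest non-DC FFT component.

    Assumption: `signal` is a 1-D sequence with one activation value per
    layer; the paper's actual sampling scheme may differ.
    """
    x = np.asarray(signal, dtype=float)
    if x.ndim != 1 or x.size < 2:
        raise ValueError("signal must be 1-D with at least 2 samples")
    # Magnitude spectrum of a real-valued signal; bin 0 is the DC term.
    magnitudes = np.abs(np.fft.rfft(x))
    # Skip bin 0 so a large mean activation cannot dominate the feature.
    k = int(np.argmax(magnitudes[1:])) + 1
    return k, float(magnitudes[k])


def spectral_features(hidden_signals):
    """Map an array of shape (num_signals, num_layers) to one magnitude each."""
    signals = np.asarray(hidden_signals, dtype=float)
    return np.array([strongest_non_dc_component(row)[1] for row in signals])


# Synthetic stand-in for a hidden-layer signal: a constant offset (DC)
# plus a single oscillation at 4 cycles over 32 "layers".
n = np.arange(32)
toy_signal = 3.0 + np.cos(2 * np.pi * 4 * n / 32)
toy_bin, toy_magnitude = strongest_non_dc_component(toy_signal)
```

On the synthetic signal, the DC bin carries the constant offset and is ignored, so the selected component is the oscillation at bin 4. In a detector along the lines the abstract describes, features like these would presumably be fed to a downstream classifier, but that stage is not specified here.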

Problem

Research questions and friction points this paper is trying to address.

Detecting hallucinations in large language models

Overcoming limitations of fact-checking and static analysis

Analyzing temporal dynamics via frequency-domain representations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fast Fourier Transform on hidden signals

Spectral features from non-DC frequencies

Optimal observation points during autoregression
