Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models

📅 2025-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) frequently generate hallucinated content despite possessing correct knowledge. To address this, we propose END—a training-free, zero-parameter-update, token-level decoding method that establishes, for the first time, a principled link between cross-layer hidden-state entropy and token-level factuality. END quantifies the variability in probability distributions of candidate tokens across transformer layers, computes a token-wise factuality score based on this entropy, and dynamically reweights logits to calibrate the output distribution. Fully unsupervised, END requires no labeled data or model fine-tuning. Evaluated on multiple hallucination detection and open-domain question answering benchmarks, END significantly improves factual consistency and information richness of generated text while preserving original QA accuracy. Our approach introduces an efficient, lightweight paradigm for factuality-aware decoding—enabling robust hallucination mitigation without architectural or parametric modifications.
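The summary above describes END's core loop: project each candidate token's probability through every transformer layer, measure how that probability is distributed across layers, and use the resulting entropy as a token-wise factuality score that reweights the final logits. A minimal sketch of one decoding step is below. Note the specific normalization (per-token distribution over layers), the sign of the adjustment, and the `alpha` weight are all illustrative assumptions, not the paper's exact formulation; a real implementation would obtain `layer_logits` by applying the LM head to each layer's hidden states.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_layer_entropy(layer_logits):
    """Token-wise entropy over layers (assumed reading of END's score).

    layer_logits: (num_layers, vocab_size) logits from early-exit heads.
    For each candidate token, normalize its probability mass across
    layers and take the entropy of that per-token layer distribution.
    A token whose probability is stable across layers yields a near-
    uniform layer distribution, i.e. a HIGH cross-layer entropy.
    """
    probs = softmax(layer_logits, axis=-1)              # (L, V)
    p_tok = probs / probs.sum(axis=0, keepdims=True)    # per-token dist over layers
    return -(p_tok * np.log(p_tok + 1e-12)).sum(axis=0) # (V,)

def end_decode_step(layer_logits, alpha=1.0):
    """Reweight final-layer logits with the cross-layer entropy score.

    alpha is a hypothetical strength hyperparameter: tokens whose
    probability is consistent across layers (high entropy under the
    normalization above) get boosted before the final softmax.
    """
    ent = cross_layer_entropy(layer_logits)
    adjusted = layer_logits[-1] + alpha * ent
    return softmax(adjusted)
```

For example, a token that is confidently predicted by every layer receives a higher score than one that only spikes at the final layer, so the calibrated distribution shifts toward the consistently supported token. No parameters are updated, matching the training-free claim.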

📝 Abstract
Despite their impressive capacities, large language models (LLMs) often struggle with the hallucination issue of generating inaccurate or fabricated content even when they possess correct knowledge. In this paper, we extend the exploration of the correlation between hidden-state prediction changes and output factuality to a deeper, token-wise level. Based on these insights, we propose cross-layer Entropy eNhanced Decoding (END), a decoding method that mitigates hallucinations without requiring extra training. END leverages inner probability changes across layers to individually quantify the factual knowledge required for each candidate token, and adjusts the final prediction distribution to prioritize tokens with higher factuality. Experiments on both hallucination and QA benchmarks demonstrate that END significantly enhances the truthfulness and informativeness of generated content while maintaining robust QA accuracy. Moreover, our work provides a deeper perspective on understanding the correlations between inherent knowledge and output factuality.
Problem

Research questions and friction points this paper is trying to address.

Mitigate hallucinations in large language models
Enhance output factuality without extra training
Quantify factual knowledge for each token
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token-wise cross-layer entropy as a factuality signal
Hallucination mitigation without training or parameter updates
Adjusting the prediction distribution to prioritize factual tokens