Rethinking the Understanding Ability across LLMs through Mutual Information

📅 2025-05-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Evaluating the intrinsic language understanding capability of large language models (LLMs) remains challenging due to its task-dependent nature and the lack of objective, task-agnostic metrics. Method: We propose an information-theoretic framework grounded in mutual information (MI), formalizing understanding as the MI between input sentences and hidden-layer representations, and derive a computable token-level lower bound. Contribution/Results: This is the first work to use MI as a unified, unsupervised measure of understanding. We identify a "late-stage forgetting" phenomenon in decoder-only architectures and introduce token recoverability, a practical, MI-aligned proxy for understanding. Experiments show that encoder-only models preserve more input information, and that fine-tuning to maximize recoverability significantly improves zero-shot understanding performance. The framework provides both theoretical grounding and an actionable methodology for modeling and optimizing linguistic understanding in LLMs.

📝 Abstract
Recent advances in large language models (LLMs) have revolutionized natural language processing, yet evaluating their intrinsic linguistic understanding remains challenging. Moving beyond specialized evaluation tasks, we propose an information-theoretic framework grounded in mutual information (MI) to achieve this. We formalize understanding as the MI between an input sentence and its latent representation (sentence-level MI), measuring how effectively input information is preserved in the latent representation. Given that LLMs learn embeddings for individual tokens, we decompose sentence-level MI into token-level MI between tokens and sentence embeddings, establishing theoretical bounds connecting these measures. Based on this foundation, we theoretically derive a computable lower bound for token-level MI using Fano's inequality, which directly relates to token-level recoverability: the ability to predict original tokens from the sentence embedding. We implement this recoverability task to comparatively measure MI across different LLMs, revealing that encoder-only models consistently maintain higher information fidelity than their decoder-only counterparts, with the latter exhibiting a distinctive late-layer "forgetting" pattern in which mutual information is first enhanced and then discarded. Moreover, fine-tuning to maximize token-level recoverability consistently improves the understanding ability of LLMs on downstream tasks without task-specific supervision, demonstrating that mutual information can serve as a foundation for understanding and improving language model capabilities.
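The Fano-based bound described in the abstract can be sketched concretely: if a probe recovers the original token from the sentence embedding with error rate e over a vocabulary V, Fano's inequality upper-bounds H(token | embedding), which in turn lower-bounds the token-level MI. The function below is an illustrative sketch of that arithmetic, not the paper's implementation; the function name, arguments, and example numbers are all assumptions for illustration.

```python
import math


def fano_mi_lower_bound(error_rate: float, vocab_size: int, token_entropy: float) -> float:
    """Lower-bound token-level MI I(token; embedding) via Fano's inequality.

    Fano bounds the conditional entropy of token X given embedding Z by the
    probe's recovery error rate e:
        H(X | Z) <= H_b(e) + e * log2(|V| - 1)
    so the mutual information satisfies
        I(X; Z) = H(X) - H(X | Z) >= H(X) - H_b(e) - e * log2(|V| - 1).

    Illustrative sketch only; names and values are not from the paper.
    """
    e = error_rate
    # Binary entropy H_b(e), taken as 0 at the endpoints by convention.
    if e in (0.0, 1.0):
        hb = 0.0
    else:
        hb = -e * math.log2(e) - (1.0 - e) * math.log2(1.0 - e)
    return token_entropy - hb - e * math.log2(vocab_size - 1)


# Hypothetical example: a probe with 20% recovery error over a 32k-token
# vocabulary, with a marginal token entropy of ~10 bits.
bound = fano_mi_lower_bound(0.2, 32000, 10.0)
```

Under this sketch, a perfectly recoverable representation (e = 0) certifies the full marginal token entropy as a lower bound on MI, while higher error rates shrink the certified bound toward zero.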
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' linguistic understanding via mutual information
Measuring token-level recoverability using Fano's inequality
Comparing information fidelity between encoder-only and decoder-only models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using mutual information for LLM evaluation
Decomposing sentence MI into token MI
Implementing recoverability task for MI measurement
Shaojie Wang
Department of Industrial and Systems Engineering, University of Houston
Sirui Ding
Bakar Computational Health Sciences Institute, UCSF
Na Zou
Assistant Professor, University of Houston
Shortcuts in Machine Learning · Interpretable Machine Learning · Network Modeling · Transfer Learning