Rethinking the Understanding Ability across LLMs through Mutual Information

📅 2025-05-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Evaluating the intrinsic language understanding capability of large language models (LLMs) remains challenging due to its task-dependent nature and the lack of objective, task-agnostic metrics. Method: We propose an information-theoretic framework grounded in mutual information (MI), formalizing understanding as the MI between input sentences and hidden-layer representations, and derive a computable token-level lower bound. Contribution/Results: This is the first work to use MI as a unified, unsupervised measure of understanding. We identify a "late-stage forgetting" phenomenon in decoder-only architectures and introduce token recoverability, a practical, MI-aligned proxy for understanding. Experiments show that encoder-only models preserve more input information, and that fine-tuning to maximize recoverability significantly improves zero-shot understanding performance. The framework provides both theoretical grounding and an actionable methodology for modeling and optimizing linguistic understanding in LLMs.

📝 Abstract
Recent advances in large language models (LLMs) have revolutionized natural language processing, yet evaluating their intrinsic linguistic understanding remains challenging. Moving beyond specialized evaluation tasks, we propose an information-theoretic framework grounded in mutual information (MI) to achieve this. We formalize understanding as the MI between an input sentence and its latent representation (sentence-level MI), measuring how effectively input information is preserved in the latent representation. Given that LLMs learn embeddings for individual tokens, we decompose sentence-level MI into token-level MI between tokens and sentence embeddings, establishing theoretical bounds connecting these measures. Based on this foundation, we theoretically derive a computable lower bound for token-level MI using Fano's inequality, which directly relates to token-level recoverability: the ability to predict original tokens from the sentence embedding. We implement this recoverability task to comparatively measure MI across different LLMs, revealing that encoder-only models consistently maintain higher information fidelity than their decoder-only counterparts, with the latter exhibiting a distinctive late-layer "forgetting" pattern in which mutual information is first enhanced and then discarded. Moreover, fine-tuning to maximize token-level recoverability consistently improves the understanding ability of LLMs on downstream tasks without task-specific supervision, demonstrating that mutual information can serve as a foundation for understanding and improving language model capabilities.
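The Fano-based bound described in the abstract can be sketched concretely: if a probe recovers the original token from the sentence embedding with error rate e over a vocabulary V, Fano's inequality upper-bounds H(token | embedding), which in turn lower-bounds the token-level MI. The function below is an illustrative sketch of that arithmetic, not the paper's implementation; the function name, arguments, and example numbers are all assumptions for illustration.

```python
import math


def fano_mi_lower_bound(error_rate: float, vocab_size: int, token_entropy: float) -> float:
    """Lower-bound token-level MI I(token; embedding) via Fano's inequality.

    Fano bounds the conditional entropy of token X given embedding Z by the
    probe's recovery error rate e:
        H(X | Z) <= H_b(e) + e * log2(|V| - 1)
    so the mutual information satisfies
        I(X; Z) = H(X) - H(X | Z) >= H(X) - H_b(e) - e * log2(|V| - 1).

    Illustrative sketch only; names and values are not from the paper.
    """
    e = error_rate
    # Binary entropy H_b(e), taken as 0 at the endpoints by convention.
    if e in (0.0, 1.0):
        hb = 0.0
    else:
        hb = -e * math.log2(e) - (1.0 - e) * math.log2(1.0 - e)
    return token_entropy - hb - e * math.log2(vocab_size - 1)


# Hypothetical example: a probe with 20% recovery error over a 32k-token
# vocabulary, with a marginal token entropy of ~10 bits.
bound = fano_mi_lower_bound(0.2, 32000, 10.0)
```

Under this sketch, a perfectly recoverable representation (e = 0) certifies the full marginal token entropy as a lower bound on MI, while higher error rates shrink the certified bound toward zero.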
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' linguistic understanding via mutual information
Measuring token-level recoverability using Fano's inequality
Comparing information fidelity between encoder-only and decoder-only models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using mutual information for LLM evaluation
Decomposing sentence MI into token MI
Implementing recoverability task for MI measurement
Shaojie Wang
Department of Industrial and Systems Engineering, University of Houston
Sirui Ding
Bakar Computational Health Sciences Institute, UCSF
Na Zou
Assistant Professor, University of Houston
Shortcuts in Machine Learning · Interpretable Machine Learning · Network Modeling · Transfer Learning