What do Language Model Probabilities Represent? From Distribution Estimation to Response Prediction

📅 2025-05-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Prior work conflates three distinct probabilistic objectives in large language models (LLMs): distribution estimation (pretraining), conditional response prediction (in-context learning), and preference alignment (preference tuning), despite their incompatible semantic interpretations. This conflation has led to widespread misinterpretation of model outputs and has hindered rigorous evaluation. Method: the paper establishes a formal probabilistic framework that distinguishes these objectives by the distribution each one targets, identifies their inherent inconsistencies, and traces common methodological pitfalls to ambiguous probability interpretations. Contribution/Results: it provides a unified theoretical foundation for interpreting LLM behavior across training paradigms, enabling principled analysis of model calibration, trustworthy reasoning assessment, and probabilistic interface design. By clarifying which distribution each objective targets, the framework shifts LLM research from heuristic tuning and phenomenological observation toward goal-directed, distributionally grounded modeling and evaluation.
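
One way to make the summary's distinction concrete (notation and formalizations are ours, not quoted from the paper) is to write down which distribution each training phase asks the model to approximate:

```latex
% Notation is ours (not the paper's): x = prompt/context, y = continuation/response.

% (1) Pretraining as distribution estimation: match the corpus distribution.
\[ \hat{p}_\theta(y \mid x) \approx p_{\text{corpus}}(y \mid x) \]

% (2) Response prediction (instruction following / in-context learning):
%     concentrate mass on responses that correctly answer x, regardless of
%     how often such responses appear in the corpus.
\[ \hat{p}_\theta(y \mid x) \approx p_{\text{resp}}(y \mid x), \qquad
   p_{\text{resp}}(y \mid x) \propto \mathbf{1}[\, y \text{ is a correct response to } x \,] \]

% (3) Preference tuning: one standard formalization (the KL-regularized RLHF
%     objective) targets the reference policy reweighted by a reward model r.
\[ \hat{p}_\theta(y \mid x) \propto p_{\text{ref}}(y \mid x)\,
   \exp\!\bigl(r(x, y)/\beta\bigr) \]
```

Because these three targets generally differ, the same output probability cannot be read as an estimate of all of them at once, which is the inconsistency the summary refers to.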

📝 Abstract
The notion of language modeling has gradually shifted in recent years from a distribution over finite-length strings to general-purpose prediction models for textual inputs and outputs, following appropriate alignment phases. This paper analyzes the distinction between distribution estimation and response prediction in the context of LLMs, and their often conflicting goals. We examine the training phases of LLMs, which include pretraining, in-context learning, and preference tuning, and also the common use cases for their output probabilities, which include completion probabilities and explicit probabilities as output. We argue that the different settings lead to three distinct intended output distributions. We demonstrate that NLP works often assume that these distributions should be similar, which leads to misinterpretations of their experimental findings. Our work sets firmer formal foundations for the interpretation of LLMs, which will inform ongoing work on the interpretation and use of LLMs' induced distributions.
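
The abstract contrasts two common ways of reading probabilities out of an LLM: the probability the model assigns to a completion, and a probability the model states explicitly in its response. Below is a minimal sketch of that contrast, assuming the Hugging Face transformers library; the checkpoint name ("gpt2") and the toy prompts are placeholders of our own, not taken from the paper.

```python
# Sketch (not from the paper): completion probability vs. verbalized probability.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Log-probability the model assigns to `completion` given `prompt`
    (the 'completion probability' reading of model outputs).
    Assumes tokenization of prompt + completion is stable at the boundary."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i of the sliced logits predicts token i + 1 of the input.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    positions = range(prompt_ids.shape[1] - 1, full_ids.shape[1] - 1)
    token_ids = full_ids[0, prompt_ids.shape[1]:]
    return sum(log_probs[pos, tok].item() for pos, tok in zip(positions, token_ids))

def verbalized_probability(question: str) -> str:
    """Ask the model to *state* a probability in its response
    (the 'explicit probability as output' reading)."""
    prompt = f"{question}\nAnswer with a probability between 0 and 1:"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(ids, max_new_tokens=8, do_sample=False,
                             pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

# As the paper argues, these two numbers answer different questions and
# target different distributions, so they need not agree.
print(completion_logprob("The capital of France is", " Paris"))
print(verbalized_probability("Is Paris the capital of France?"))
```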
Problem

Research questions and friction points this paper is trying to address.

Distinguishing distribution estimation vs response prediction in LLMs
Analyzing conflicting goals in LLM training phases
Clarifying misinterpretations of LLM output probabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzes distribution estimation vs response prediction
Examines pretraining and tuning phases of LLMs
Clarifies three distinct intended output distributions