Information Locality as an Inductive Bias for Neural Language Models

📅 2025-06-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates whether neural language models (LMs) exhibit inductive biases aligned with the locality constraints observed in human language processing. Method: We introduce *m-local entropy*, an information-theoretic measure quantifying the local disambiguatory power of context for predicting the next symbol, enabling controlled evaluation of model sensitivity to local statistical structure. Using average lossy-context surprisal as a modeling framework, we systematically assess Transformer and LSTM prediction behavior on perturbed natural text and on synthetic languages generated by probabilistic finite-state automata. Results: Higher m-local entropy correlates with increased prediction difficulty, and both architectures demonstrate human-like sensitivity to local statistics. This work is the first to formally characterize linguistic information locality and to empirically link cognitive plausibility to LM generalization behavior, revealing the nature of their implicit inductive biases.

📝 Abstract
Inductive biases are inherent in every machine learning system, shaping how models generalize from finite data. In the case of neural language models (LMs), debates persist as to whether these biases align with or diverge from human processing constraints. To address this issue, we propose a quantitative framework that allows for controlled investigations into the nature of these biases. Within our framework, we introduce $m$-local entropy – an information-theoretic measure derived from average lossy-context surprisal – that captures the local uncertainty of a language by quantifying how effectively the $m-1$ preceding symbols disambiguate the next symbol. In experiments on both perturbed natural language corpora and languages defined by probabilistic finite-state automata (PFSAs), we show that languages with higher $m$-local entropy are more difficult for Transformer and LSTM LMs to learn. These results suggest that neural LMs, much like humans, are highly sensitive to the local statistical structure of a language.
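The quantity described in the abstract can be illustrated with a simple plug-in estimator: the conditional entropy of the next symbol given the $m-1$ preceding symbols, estimated from n-gram counts over a sequence. This is a minimal sketch for intuition only; the paper derives $m$-local entropy from average lossy-context surprisal, which this count-based estimator does not reproduce, and the function name is illustrative.

```python
from collections import Counter
from math import log2

def m_local_entropy(seq, m):
    """Plug-in estimate of H(next symbol | previous m-1 symbols) in bits.

    A low value means short contexts strongly disambiguate the next
    symbol; a high value means local context leaves much uncertainty.
    """
    ctx_counts = Counter()   # counts of (m-1)-symbol contexts
    pair_counts = Counter()  # counts of (context, next symbol) pairs
    for i in range(m - 1, len(seq)):
        ctx = tuple(seq[i - m + 1:i])
        ctx_counts[ctx] += 1
        pair_counts[(ctx, seq[i])] += 1

    total = sum(pair_counts.values())
    h = 0.0
    for (ctx, _sym), c in pair_counts.items():
        p_joint = c / total            # empirical P(context, symbol)
        p_cond = c / ctx_counts[ctx]   # empirical P(symbol | context)
        h -= p_joint * log2(p_cond)
    return h
```

For a fully predictable sequence such as `"abababab"`, the estimate with `m=2` is 0 bits; for a sequence whose next symbol is near-uniform given one symbol of context, it approaches 1 bit.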
Problem

Research questions and friction points this paper is trying to address.

Investigates neural LMs' inductive biases versus human processing constraints
Proposes m-local entropy to measure language's local uncertainty
Shows higher m-local entropy makes learning harder for LMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the m-local entropy measure, derived from average lossy-context surprisal
Quantifies the local uncertainty of a language
Tests Transformer and LSTM sensitivity to local statistical structure on perturbed and synthetic languages
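The synthetic languages in the experiments are defined by probabilistic finite-state automata. The sketch below shows how strings can be sampled from a small PFSA; the two-state transition table and the halting convention are hypothetical placeholders, not the automata used in the paper.

```python
import random

# Hypothetical two-state PFSA over the alphabet {a, b}.
# Each arc is (symbol, next state, probability); a None symbol means halt.
# Arc probabilities out of each state sum to 1.
PFSA = {
    "q0": [("a", "q1", 0.6), ("b", "q0", 0.3), (None, None, 0.1)],
    "q1": [("b", "q0", 0.7), ("a", "q1", 0.2), (None, None, 0.1)],
}

def sample_string(pfsa, start="q0", rng=random):
    """Sample one string by walking the PFSA until a halt arc is drawn."""
    state, out = start, []
    while True:
        r = rng.random()
        acc = 0.0
        for sym, nxt, p in pfsa[state]:
            acc += p
            if r < acc:
                break
        if sym is None:          # halt arc drawn
            return "".join(out)
        out.append(sym)
        state = nxt
```

Corpora of such samples give full control over a language's local statistics, which is what allows m-local entropy to be varied systematically.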