🤖 AI Summary
This study investigates whether large language models (LLMs) can reliably distinguish grammatical from ungrammatical sentences using token-level surprisal—a common proxy for syntactic knowledge—thereby challenging the default assumption that generation probabilities directly reflect grammatical competence. We introduce a minimal-pair surprisal-difference analysis and construct a novel benchmark targeting syntactic, semantic, and pragmatic anomalies under controlled conditions. Four state-of-the-art LLMs are systematically evaluated on their surprisal responses to these stimuli. Results show no consistent increase in surprisal for ungrammatical sentences; instead, semantic and pragmatic violations elicit significantly higher surprisal than syntactic ones. This indicates that surface-level probability outputs do not robustly encode or expose LLMs’ implicit syntactic knowledge boundaries. Our work provides both a methodological advance—via minimal-pair surprisal contrast—and empirical evidence that challenges probabilistic proxies for grammaticality, offering a refined framework for evaluating linguistic competence in LLMs.
📝 Abstract
A controversial test for Large Language Models concerns their ability to discern possible from impossible language. While some evidence attests to the models' sensitivity to what crosses the limits of possible language, this evidence has been contested on the grounds of the soundness of the testing material. We use model-internal representations to tap directly into the way Large Language Models represent the 'grammatical-ungrammatical' distinction. In a novel benchmark, we elicit probabilities from 4 models and compute minimal-pair surprisal differences, juxtaposing the probabilities assigned to grammatical sentences with those assigned to (i) lower-frequency grammatical sentences, (ii) ungrammatical sentences, (iii) semantically odd sentences, and (iv) pragmatically odd sentences. The prediction is that if string probabilities can function as proxies for the limits of grammar, the ungrammatical condition will stand out among the conditions that involve linguistic violations, showing a spike in surprisal. Our results do not reveal a unique surprisal signature for ungrammatical prompts: the semantically and pragmatically odd conditions consistently show higher surprisal. We thus demonstrate that probabilities do not constitute reliable proxies for model-internal representations of syntactic knowledge. Consequently, claims about models being able to distinguish possible from impossible language need verification through a different methodology.
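The minimal-pair surprisal-difference measure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-token probabilities have already been elicited from a model, and the sentences and probability values shown are invented for demonstration.

```python
import math

def surprisal(token_probs):
    """Total surprisal of a sentence in bits: sum of -log2(p) over its tokens."""
    return sum(-math.log2(p) for p in token_probs)

def minimal_pair_delta(grammatical_probs, variant_probs):
    """Surprisal difference within a minimal pair: variant minus grammatical
    baseline. A large positive value means the variant is more surprising
    to the model than its grammatical counterpart."""
    return surprisal(variant_probs) - surprisal(grammatical_probs)

# Hypothetical per-token probabilities for one minimal pair
# (invented values, for illustration only):
grammatical = [0.20, 0.50, 0.40]    # e.g. "The keys are on the table"
ungrammatical = [0.20, 0.10, 0.40]  # e.g. "The keys is on the table"

delta = minimal_pair_delta(grammatical, ungrammatical)
```

Under the paper's prediction, deltas for the ungrammatical condition would spike above those for the semantically and pragmatically odd conditions if string probabilities tracked the limits of grammar; the reported results show the opposite pattern.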