Language Models Are Implicitly Continuous

📅 2025-04-04

🤖 AI Summary

This work addresses the tension between the discrete symbolic representations traditionally used for language and the continuous computation performed by neural networks. Method: The authors propose and empirically validate that mainstream Transformer-based large language models (Llama2, Llama3, Phi3, Gemma, Gemma2, and Mistral) implicitly encode discrete text inputs as smooth functions over a continuous time domain. They formally extend the Transformer architecture to support continuity in both time and space, and develop an interpretable, quantitative metric for implicit continuity. Contribution/Results: Using continuous function approximation theory and visualization of the input and output spaces, they show that the phenomenon holds across all six LLMs studied. The findings challenge the conventional discrete-sequence view of language modeling and suggest a different theoretical perspective: LLMs may represent and process language as continuous, non-symbolic dynamical systems rather than through human-like discrete symbol manipulation.

📝 Abstract

Language is typically modelled with discrete sequences. However, the most successful approaches to language modelling, namely neural networks, are continuous and smooth function approximators. In this work, we show that Transformer-based language models implicitly learn to represent sentences as continuous-time functions defined over a continuous input space. This phenomenon occurs in most state-of-the-art Large Language Models (LLMs), including Llama2, Llama3, Phi3, Gemma, Gemma2, and Mistral, and suggests that LLMs reason about language in ways that fundamentally differ from humans. Our work formally extends Transformers to capture the nuances of time and space continuity in both input and output space. Our results challenge the traditional interpretation of how LLMs understand language, with several linguistic and engineering implications.
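One way to picture a sentence as a continuous-time function is to treat each token as a constant signal held for some duration, so that attention becomes an integral over time instead of a sum over positions. The sketch below is only an illustration under that assumption, not the paper's construction: the function name `duration_weighted_attention` and the `durations` parameter are hypothetical, and positional encodings are ignored. With unit durations it reduces to ordinary softmax attention.

```python
import math


def duration_weighted_attention(query, keys, values, durations):
    """Attention over a piecewise-constant signal (illustrative assumption).

    Token j holds keys[j] and values[j] for a time span of durations[j].
    Integrating exp(q . k(t)) * v(t) over time then reduces to a
    duration-weighted softmax over the tokens.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [d * math.exp(s - max_score) for s, d in zip(scores, durations)]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Unit durations: ordinary softmax attention over three tokens.
discrete = duration_weighted_attention(query, keys, values, [1.0, 1.0, 1.0])

# Splitting the first token into two half-length copies describes the same
# function of time, so the output is unchanged.
split = duration_weighted_attention(
    query,
    [keys[0], keys[0], keys[1], keys[2]],
    [values[0], values[0], values[1], values[2]],
    [0.5, 0.5, 1.0, 1.0],
)

# Shortening the first token reduces its influence smoothly.
shortened = duration_weighted_attention(query, keys, values, [0.25, 1.0, 1.0])
```

In this toy setting the output depends on the token sequence only through the function of time it describes, which is the sense in which a discrete sequence can be read as one sample of a continuous object.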

Problem

Research questions and friction points this paper is trying to address.

Study how Transformer models represent sentences as continuous functions

Explore how LLMs' reasoning about language differs from human language processing

Extend Transformers to capture continuity in time and space, in both input and output

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformers model sentences as continuous-time functions

LLMs represent language over a continuous input space

Transformers are formally extended to handle continuity in time and space
