AI Summary
Problem: A unified theoretical framework characterizing the macroscopic generative dynamics of LLM-driven agents remains lacking; it is unknown whether universal, architecture- and prompt-agnostic physical laws govern such dynamics.
Method: We model LLM text generation via the principle of least action, integrating large-scale statistical estimation of state-transition probabilities with trajectory-level analysis, and conduct the first empirical test of detailed balance in LLM generative dynamics.
Contribution/Results: We empirically demonstrate that LLMs implicitly learn an underlying potential function governing token transitions, and that their state transitions strictly satisfy detailed balance, a property robust across diverse models and prompts. This establishes the first quantifiable physical foundation for the macroscopic generative process of LLMs, advancing AI agent research from empirical engineering toward a measurable, predictive scientific paradigm.
Abstract
Large language model (LLM)-driven agents are emerging as a powerful paradigm for solving complex problems. Despite their empirical success, a theoretical framework that understands and unifies their macroscopic dynamics remains lacking. This Letter proposes a method based on the principle of least action to estimate the underlying generative directionality of LLMs embedded within agents. By experimentally measuring the transition probabilities between LLM-generated states, we statistically uncover detailed balance in LLM-generated transitions, indicating that LLM generation may proceed not by learning general rule sets and strategies, but by implicitly learning a class of underlying potential functions that may transcend specific LLM architectures and prompt templates. To our knowledge, this is the first discovery of a macroscopic physical law in LLM generative dynamics that does not depend on model details. This work is an attempt to establish a macroscopic dynamics theory for complex AI systems, aiming to elevate the study of AI agents from a collection of engineering practices to a predictive, quantifiable science built on effective measurements.
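As a rough illustration of the kind of measurement the abstract describes, the sketch below checks detailed balance on a transition matrix estimated from observed state-transition counts. The state space, count matrices, and tolerance here are illustrative assumptions for a generic Markov chain, not the Letter's actual LLM measurement pipeline: a chain satisfies detailed balance when the probability flux is symmetric, i.e. pi_i T_ij = pi_j T_ji for the stationary distribution pi.

```python
import numpy as np

def detailed_balance_violation(counts):
    """Estimate row-stochastic transition probabilities T and the stationary
    distribution pi from observed transition counts, then return the largest
    probability-flux asymmetry max_ij |pi_i T_ij - pi_j T_ji|.
    This is zero exactly when detailed balance holds."""
    counts = np.asarray(counts, dtype=float)
    T = counts / counts.sum(axis=1, keepdims=True)  # empirical transition matrix
    # Stationary distribution = left eigenvector of T for eigenvalue 1.
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = pi / pi.sum()
    flux = pi[:, None] * T  # probability flux from state i to state j
    return float(np.max(np.abs(flux - flux.T)))

# Illustrative data: a symmetric count matrix defines a reversible chain,
# so its violation is numerically zero; breaking the symmetry (a net cycle
# 0 -> 1 -> 2 -> 0) makes the violation strictly positive.
reversible = np.array([[10.,  5.,  2.],
                       [ 5., 20.,  8.],
                       [ 2.,  8., 30.]])
irreversible = np.array([[10.,  9.,  1.],
                         [ 1., 20.,  9.],
                         [ 9.,  1., 30.]])
print(detailed_balance_violation(reversible))    # ~ 0 (machine precision)
print(detailed_balance_violation(irreversible))  # clearly positive
```

In this toy setting, detailed balance is equivalent to the existence of a potential U with T_ij / T_ji = exp(U_i - U_j), which is the sense in which a measured balance condition points to an underlying potential function.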