🤖 AI Summary
It remains unclear whether large language models (LLMs) genuinely internalize formal logical rules or merely exploit superficial semantic correlations.
Method: We model logical reasoning as geometric flow in representation space, introducing the first framework that unifies natural deduction systems with differential-geometric dynamics: logical propositions act as controllers that modulate the local velocity of embedding trajectories. By decoupling logical structure from semantic content, we design controlled experiments and formally characterize inference paths with geometric quantities (position, velocity, and curvature) for both modeling and visualization.
Results: Empirical analysis reveals that LLM inference traces smooth, quantifiable, logic-constrained manifold trajectories in representation space. Our framework not only provides a geometric characterization of reasoning paths but also enables explicit, continuous control over inference dynamics. Crucially, it establishes the first computationally tractable and interpretable geometric benchmark for rigorously evaluating the formal reasoning capabilities of LLMs.
📝 Abstract
We study how large language models (LLMs) "think" through their representation space. We propose a novel geometric framework that models an LLM's reasoning as flows: embedding trajectories whose evolution is steered by logic. We disentangle logical structure from semantics by applying the same natural deduction propositions to varied semantic carriers, which lets us test whether LLMs internalize logic beyond surface form. This perspective connects reasoning with geometric quantities such as position, velocity, and curvature, enabling formal analysis in representation and concept spaces. Our theory establishes that (1) LLM reasoning corresponds to smooth flows in representation space, and (2) logical statements act as local controllers of these flows' velocities. Using learned representation proxies, we design controlled experiments that visualize and quantify reasoning flows, providing empirical validation of our theoretical framework. Our work serves as both a conceptual foundation and a set of practical tools for studying reasoning phenomena, offering a new lens for interpretability and formal analysis of LLM behavior.
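To make the geometric quantities above concrete, here is a minimal numerical sketch (not the paper's implementation) of how position, velocity, and curvature can be extracted from a discrete trajectory of hidden states. It assumes only that a reasoning trace is available as a sequence of embedding vectors `X` of shape `(T, d)`; the function name `flow_quantities` and the finite-difference scheme are illustrative choices, using the standard parametrization-invariant curvature formula in arbitrary dimension.

```python
import numpy as np

def flow_quantities(X):
    """Given a trajectory X of shape (T, d) -- e.g. a sequence of hidden
    states along a reasoning trace -- return the discrete velocity,
    speed, and curvature at each step.

    Velocity and acceleration use central finite differences (one-sided
    at the endpoints, via np.gradient). Curvature uses the general
    d-dimensional formula
        kappa = sqrt(|a|^2 |v|^2 - (a . v)^2) / |v|^3,
    which is invariant under reparametrization of the trajectory.
    """
    X = np.asarray(X, dtype=float)
    v = np.gradient(X, axis=0)   # discrete velocity, shape (T, d)
    a = np.gradient(v, axis=0)   # discrete acceleration, shape (T, d)
    speed = np.linalg.norm(v, axis=1)
    # Numerator of the curvature formula; clip tiny negatives from rounding.
    num = np.linalg.norm(a, axis=1) ** 2 * speed ** 2 \
        - np.einsum("td,td->t", a, v) ** 2
    kappa = np.sqrt(np.clip(num, 0.0, None)) / np.clip(speed, 1e-12, None) ** 3
    return v, speed, kappa
```

As a sanity check, a trajectory moving along a straight line has curvature zero everywhere, while a trajectory tracing a circle of radius r recovers curvature 1/r at interior points; in the paper's framing, spikes in `kappa` would mark steps where the flow bends, e.g. where a logical statement redirects the trajectory.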