🤖 AI Summary
This work proposes TRACED, a framework that addresses the limitations of scalar-probability evaluation in capturing the structural dynamics and reliability of large language model (LLM) reasoning. Introducing geometric dynamics into LLM reasoning analysis for the first time, TRACED models reasoning trajectories along two geometric dimensions, "progress" (displacement) and "stability" (curvature), revealing a fundamental topological distinction between correct reasoning and hallucination. The framework maps "hesitation loops" to high curvature and "certainty accumulation" to displacement, offering a physical perspective on machine cognition. Experiments show that TRACED achieves strong performance and robustness across multiple benchmarks, effectively discriminating correct reasoning from hallucinatory behavior.
📝 Abstract
Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We introduce TRACED, a framework that assesses reasoning quality through theoretically grounded geometric kinematics. By decomposing reasoning traces into Progress (displacement) and Stability (curvature), we reveal a distinct topological divergence: correct reasoning manifests as high-progress, stable trajectories, whereas hallucinations are characterized by low-progress, unstable patterns (stalled displacement with high curvature fluctuations). Leveraging these signatures, our probabilistic framework achieves competitive performance and superior robustness across diverse benchmarks. Crucially, TRACED bridges geometry and cognition by mapping high curvature to "Hesitation Loops" and displacement to "Certainty Accumulation", offering a physical lens to decode the internal dynamics of machine thought.
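To make the Progress/Stability decomposition concrete, here is a minimal illustrative sketch (not the paper's actual implementation; the function name and the choice of turning angle as a curvature proxy are assumptions) that computes per-step displacement and a discrete curvature signal from a trajectory of hidden states:

```python
import numpy as np

def trajectory_kinematics(states):
    """Illustrative sketch: per-step displacement and a discrete
    curvature proxy for a reasoning trajectory given as a (T, d)
    array of hidden states. Not the paper's implementation."""
    states = np.asarray(states, dtype=float)
    steps = np.diff(states, axis=0)               # (T-1, d) step vectors
    displacement = np.linalg.norm(steps, axis=1)  # "Progress" per step

    # Curvature proxy: turning angle between consecutive step vectors.
    # Near-zero angles = a stable, directed trajectory; large angles
    # suggest the "hesitation loops" associated with hallucination.
    v1, v2 = steps[:-1], steps[1:]
    cos = np.einsum("ij,ij->i", v1, v2) / (
        np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1) + 1e-12
    )
    curvature = np.arccos(np.clip(cos, -1.0, 1.0))  # radians, shape (T-2,)
    return displacement, curvature
```

Under this sketch, a straight-line trajectory yields constant displacement and near-zero curvature (the "correct reasoning" signature), while a trajectory that repeatedly backtracks yields small net progress and turning angles near π.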