AI Summary
Existing analysis tools lack the resolution to capture the cognitive dynamics of large language models (LLMs) during inference. This work proposes HyperLens, a fine-grained method for tracking confidence trajectories that leverages an intrinsic magnification mechanism in deep Transformer layers to detect subtle shifts in model confidence. HyperLens reveals, for the first time, a stable divergence between the trajectories of complex and simple tasks, which enables the definition of a quantitative cognitive effort metric. Experiments show that cognitive effort correlates strongly with task complexity, and that standard supervised fine-tuning (SFT) can reduce cognitive effort and thereby degrade in-domain performance, highlighting its detrimental effect on the model's underlying cognitive behavior.
Abstract
While Large Language Models (LLMs) achieve strong performance across diverse tasks, their inference dynamics remain poorly understood because of the limited resolution of existing analysis tools. In this work, we identify an intrinsic magnification mechanism in transformer architectures: deeper layers inherently magnify the small changes of layer-wise confidence, providing a fine-grained confidence trajectory. Building on this insight, we introduce HyperLens, a high-resolution probe designed to trace confidence trajectories and quantify the cognitive effort during inference. Across LLMs and datasets, HyperLens reveals a consistent divergence in confidence trajectories that separates complex from simple tasks. We abstract this pattern into a quantitative cognitive effort metric. Our analysis reveals a fundamental principle: complex tasks consistently require higher cognitive effort. Finally, we provide a mechanistic diagnosis of a common side effect of standard Supervised Fine-Tuning (SFT): it can reduce cognitive effort and consequently degrade performance on in-domain tasks.
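To make the idea of a layer-wise confidence trajectory concrete, here is a minimal sketch. It does not reproduce HyperLens: the abstract does not specify the probe or the cognitive effort metric, so this uses a plain logit-lens-style readout as a stand-in (decode each layer's state to logits, then record the probability of the target token). The function names, the toy logits, and the `trajectory_gap` summary are all illustrative assumptions.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_trajectory(layer_logits: List[List[float]], target: int) -> List[float]:
    """Probability assigned to `target` at each layer (one value per layer).

    Assumption: `layer_logits[i]` is layer i's hidden state already projected
    to vocabulary logits, as in a logit-lens-style readout.
    """
    return [softmax(logits)[target] for logits in layer_logits]


def trajectory_gap(a: List[float], b: List[float]) -> float:
    """Mean absolute per-layer difference between two trajectories.

    Hypothetical summary statistic, NOT the paper's cognitive effort metric.
    """
    if len(a) != len(b):
        raise ValueError("trajectories must have the same number of layers")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Made-up per-layer logits over a 3-token vocabulary; the target is token 0.
easy_logits = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
hard_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]

easy = confidence_trajectory(easy_logits, target=0)
hard = confidence_trajectory(hard_logits, target=0)
gap = trajectory_gap(easy, hard)
print(easy, hard, gap)
```

In this toy input both trajectories start at chance and end at the same confidence, but the "hard" one commits a layer later, so the gap is nonzero. With a real model, the per-layer logits would come from the model's hidden states rather than hand-written lists.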