Beyond Confidence: The Rhythms of Reasoning in Generative Models

πŸ“… 2026-02-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Large language models are sensitive to input perturbations, yet conventional metrics such as perplexity often fail to capture local prediction instabilities. To address this limitation, this work proposes the Token Constraint Bound (Ξ΄_TCB), a novel metric that analyzes the geometric structure of the output embedding space to quantify the maximum perturbation a model's internal state can tolerate while preserving its dominant next-token prediction. This is the first approach to characterize prediction robustness through such geometric analysis. Experimental results demonstrate that Ξ΄_TCB uncovers prediction vulnerabilities in prompt engineering scenarios that traditional evaluation metrics overlook, offering a new perspective for assessing and improving model robustness.

πŸ“ Abstract
Large Language Models (LLMs) exhibit impressive capabilities yet suffer from sensitivity to slight input context variations, hampering reliability. Conventional metrics like accuracy and perplexity fail to assess local prediction robustness, as normalized output probabilities can obscure the underlying resilience of an LLM's internal state to perturbations. We introduce the Token Constraint Bound ($\delta_{\mathrm{TCB}}$), a novel metric that quantifies the maximum internal state perturbation an LLM can withstand before its dominant next-token prediction significantly changes. Intrinsically linked to output embedding space geometry, $\delta_{\mathrm{TCB}}$ provides insights into the stability of the model's internal predictive commitment. Our experiments show $\delta_{\mathrm{TCB}}$ correlates with effective prompt engineering and uncovers critical prediction instabilities missed by perplexity during in-context learning and text generation. $\delta_{\mathrm{TCB}}$ offers a principled, complementary approach to analyze and potentially improve the contextual stability of LLM predictions.
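The abstract does not give the formula for $\delta_{\mathrm{TCB}}$, but for a linear readout its geometric flavor can be illustrated: the smallest L2 perturbation of a hidden state that flips the argmax of the logits is the margin distance to the nearest decision boundary between the top token's output embedding and each rival's. The sketch below assumes exactly that interpretation (an un-normalized linear logit layer and an L2 ball of perturbations); the paper's actual definition may differ.

```python
import numpy as np

def token_constraint_bound(h: np.ndarray, W: np.ndarray) -> float:
    """Illustrative margin-style bound (assumption, not the paper's formula):
    the smallest L2 perturbation of hidden state h that can change the
    argmax of the logits W @ h, given a linear readout with output
    embedding matrix W of shape (vocab, hidden)."""
    logits = W @ h
    top = int(np.argmax(logits))
    # Distance from h to the decision boundary between `top` and rival j:
    # ((W[top] - W[j]) @ h) / ||W[top] - W[j]||
    diffs = W[top] - W                      # (vocab, hidden)
    norms = np.linalg.norm(diffs, axis=1)   # boundary normals' lengths
    margins = logits[top] - logits          # logit gaps to each rival
    mask = np.arange(W.shape[0]) != top     # exclude the top token itself
    return float(np.min(margins[mask] / norms[mask]))

# Toy example: 3-token vocabulary, 2-dimensional hidden state.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
h = np.array([2.0, 1.0])
delta = token_constraint_bound(h, W)  # β‰ˆ 0.7071 (1/√2)
```

On this toy input the binding constraint is the boundary between tokens 0 and 1, so the bound equals the logit gap (2 βˆ’ 1) divided by the boundary normal's length √2. A small value signals a fragile prediction even when the softmax probability of the top token looks comfortable, which is the gap the abstract says perplexity misses.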
Problem

Research questions and friction points this paper is trying to address.

robustness
large language models
input sensitivity
prediction stability
contextual perturbations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Token Constraint Bound
internal state robustness
output embedding geometry
prediction stability
in-context learning
Deyuan Liu
Harbin Institute of Technology
Zecheng Wang
Harbin Institute of Technology
Zhanyue Qin
Harbin Institute of Technology
Zhiying Tu
Harbin Institute of Technology
Dianhui Chu
Harbin Institute of Technology
Dianbo Sui
Harbin Institute of Technology