On the Self-awareness of Large Reasoning Models' Capability Boundaries

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large reasoning models (LRMs) frequently suffer from unproductive chain-of-thought reasoning—persisting until context exhaustion—yielding incorrect answers and wasted computation, primarily due to their lack of self-awareness regarding capability boundaries. This work identifies, for the first time, that LRMs’ dynamic reasoning confidence and the linear separability of their final-layer hidden states strongly correlate with their intrinsic reasoning limits. Building on this insight, we propose a novel “boundary-aware” paradigm that jointly leverages black-box confidence trajectory analysis and white-box hidden-state discriminability to enable real-time capability assessment and early termination of reasoning. Experiments across diverse reasoning benchmarks demonstrate that our method reduces token consumption by 62.7%–93.6% while preserving original accuracy, significantly enhancing both reliability and efficiency of LRM inference. This establishes a principled foundation for trustworthy, resource-aware reasoning.

📝 Abstract
Large Reasoning Models (LRMs) have shown impressive performance on complex reasoning tasks such as mathematics, yet they also display misbehaviors that expose their limitations. In particular, when faced with hard questions, LRMs often engage in unproductive reasoning until they hit the context limit, producing wrong answers while wasting substantial computation. This phenomenon reflects a fundamental issue: current answering paradigms overlook the relationship between questions and LRMs' capability boundaries. In this paper, we investigate whether LRMs possess self-awareness of their capability boundaries. We begin with the observation that LRMs may know what they cannot solve through their expressed reasoning confidence. For black-box models, we find that reasoning expressions reveal boundary signals: confidence trajectories grow at an accelerating rate for solvable problems but converge to uncertainty for unsolvable ones. For white-box models, we show that the hidden states of the last input token encode boundary information, with solvable and unsolvable problems linearly separable even before reasoning begins. Building on these findings, we propose two simple yet effective optimization strategies: reasoning expression monitoring and hidden states monitoring. Experiments demonstrate that these boundary-aware strategies enable LRMs to avoid unproductive reasoning without sacrificing accuracy, significantly improving reliability and efficiency by cutting token usage by 62.7%–93.6%.
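The black-box signal described above, a confidence trajectory that plateaus at low confidence rather than accelerating upward, can be sketched as a simple early-termination check. This is an illustrative toy, not the paper's actual detector: the function name, window size, and thresholds are all assumptions.

```python
# Hypothetical sketch of "reasoning expression monitoring": watch the
# model's per-step confidence trajectory and stop early when it has
# converged at low confidence (the unsolvable signal) instead of
# accelerating upward (the solvable signal).
# window/plateau_eps/low_conf are illustrative values, not the paper's.

def should_terminate(confidences, window=4, plateau_eps=0.02, low_conf=0.5):
    """Return True if the trajectory has converged to low confidence."""
    if len(confidences) < window:
        return False  # not enough steps yet to judge the trend
    recent = confidences[-window:]
    # Near-zero drift over the recent window means the trajectory converged.
    drift = max(recent) - min(recent)
    plateaued = drift < plateau_eps
    still_uncertain = recent[-1] < low_conf
    return plateaued and still_uncertain

# Solvable-style trajectory: confidence accelerates upward -> keep reasoning.
solvable = [0.30, 0.35, 0.45, 0.60, 0.80, 0.92]
# Unsolvable-style trajectory: uncertainty converges -> terminate early.
unsolvable = [0.30, 0.32, 0.31, 0.31, 0.30, 0.31]

print(should_terminate(solvable))    # False
print(should_terminate(unsolvable))  # True
```

In a real deployment the per-step confidences would come from the model's own expressed uncertainty during reasoning; here they are hand-written lists.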
Problem

Research questions and friction points this paper is trying to address.

Investigating self-awareness of Large Reasoning Models' capability boundaries
Addressing unproductive reasoning on unsolvable problems with boundary signals
Proposing monitoring strategies to improve reliability and efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monitoring reasoning expressions to detect capability boundaries
Analyzing hidden states for linear separability of problems
Using boundary-aware strategies to reduce token usage
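The second contribution, linear separability of last-input-token hidden states, amounts to fitting a linear probe before reasoning starts. A minimal sketch, assuming toy 2-D "hidden states" as stand-ins for real model activations; the probe itself is plain logistic regression trained by gradient descent:

```python
# Illustrative sketch of "hidden states monitoring": a linear probe
# trained to separate solvable from unsolvable questions using the
# hidden state of the last input token, before any reasoning happens.
# The 2-D vectors below are toy stand-ins for real activations.
import math
import random

def train_probe(xs, ys, lr=0.5, epochs=200):
    """Fit w, b for sigmoid(w.x + b) by stochastic gradient descent."""
    dim = len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            logit = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-logit))
            g = p - y  # gradient of log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict_solvable(w, b, x):
    """Positive side of the learned hyperplane -> predicted solvable."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0

random.seed(0)
# Toy hidden states: "solvable" cluster near (1, 1), "unsolvable" near (-1, -1).
solvable   = [(1 + random.gauss(0, 0.2), 1 + random.gauss(0, 0.2)) for _ in range(20)]
unsolvable = [(-1 + random.gauss(0, 0.2), -1 + random.gauss(0, 0.2)) for _ in range(20)]
xs = solvable + unsolvable
ys = [1] * 20 + [0] * 20

w, b = train_probe(xs, ys)
acc = sum(predict_solvable(w, b, x) == (y == 1) for x, y in zip(xs, ys)) / len(xs)
print(f"probe accuracy: {acc:.2f}")
```

Because the toy clusters are well separated, the probe classifies them nearly perfectly, mirroring the paper's finding that boundary information is linearly decodable from hidden states.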
Qingjie Zhang
Tsinghua University
Yujia Fu
Beijing University of Posts and Telecommunications
Yang Wang
Ant Group
Liu Yan
Researcher and Director, Ant Group
Tao Wei
Ant Group
Ke Xu
Tsinghua University
Minlie Huang
Tsinghua University
Han Qiu
NTU