The Detection--Extraction Gap: Models Know the Answer Before They Can Say It

📅 2026-04-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the inefficiency of large language models (LLMs) in reasoning tasks, where redundant generation often degrades performance and may overwrite correct answers already present in internal states. The study formally characterizes this issue as the "detection–extraction gap": while correct answers emerge early in the model's hidden representations, prompt constraints hinder their timely extraction. By quantifying the distributional divergence, measured via total variation distance, between free-form continuation and forced-answer extraction, the authors motivate a Black-box Adaptive Early Exit mechanism (BAEE). Evaluated across multiple models and benchmarks, BAEE reduces average generation length by 70–78% while improving accuracy by 1–5 percentage points (up to +5.8 pp for thinking-mode models). A cost-optimized variant achieves a 68–73% reduction in generation with a median of only nine API calls.
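The summary measures the free-vs-forced divergence with total variation distance. As a minimal sketch (not the authors' estimator), the empirical TV distance between two sampled answer distributions can be computed like this; the sample lists below are illustrative placeholders:

```python
from collections import Counter

def total_variation(p_samples, q_samples):
    """Empirical total-variation distance between two answer distributions:
    TV(P, Q) = 0.5 * sum_a |P(a) - Q(a)| over all observed answers."""
    p, q = Counter(p_samples), Counter(q_samples)
    n_p, n_q = len(p_samples), len(q_samples)
    support = set(p) | set(q)
    return 0.5 * sum(abs(p[a] / n_p - q[a] / n_q) for a in support)

# Hypothetical answers sampled from the same partial trace: free
# continuations mostly recover the answer, forced extraction does not.
free   = ["42", "42", "42", "17", "42"]
forced = ["17", "17", "42", "17", "17"]
print(round(total_variation(free, forced), 3))  # 0.6
```

A large TV distance here is the suffix-induced shift the paper formalizes: conditioning the same prefix on an extraction prompt moves the answer distribution away from what free decoding would produce.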
📝 Abstract
Modern reasoning models continue generating long after the answer is already determined. Across five model configurations, two families, and three benchmarks, we find that **52–88% of chain-of-thought tokens are produced after the answer is recoverable** from a partial prefix. This post-commitment generation reveals a structural phenomenon: the **detection–extraction gap**. Free continuations from early prefixes recover the correct answer even at 10% of the trace, while forced extraction fails on 42% of these cases. The answer is recoverable from the model state, yet prompt-conditioned decoding fails to extract it. We formalize this mismatch via a total-variation bound between free and forced continuation distributions, yielding quantitative estimates of suffix-induced shift. Exploiting this asymmetry, we propose Black-box Adaptive Early Exit (BAEE), which uses free continuations for both detection and extraction, truncating **70–78% of serial generation** while **improving accuracy by 1–5 pp** across all models. For thinking-mode models, early exit prevents post-commitment overwriting, yielding gains of up to 5.8 pp; a cost-optimized variant achieves a 68–73% reduction at a median of 9 API calls. Code is available at https://github.com/EdWangLoDaSc/know2say.
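The abstract states that BAEE uses free continuations for both detection and extraction against a black-box API. A heavily hedged sketch of that loop, with every name (`generate`, `extract_answer`, the thresholds) being a hypothetical stand-in rather than the paper's actual interface:

```python
import re
from collections import Counter

def extract_answer(text):
    """Toy extractor: take the last number in a continuation (hypothetical)."""
    nums = re.findall(r"-?\d+(?:\.\d+)?", text)
    return nums[-1] if nums else None

def adaptive_early_exit(generate, prompt, max_tokens=2048, check_every=128,
                        n_free=4, agree_threshold=0.75):
    """Sketch of black-box adaptive early exit in the spirit of BAEE.

    `generate(prompt, max_tokens)` is an assumed black-box completion API.
    Periodically sample short *free* continuations from the current prefix;
    if their extracted answers agree strongly, stop and return that answer
    instead of forcing an extraction prompt on the model.
    """
    prefix = ""
    for _ in range(0, max_tokens, check_every):
        # Extend the chain of thought by one chunk of serial generation.
        prefix += generate(prompt + prefix, check_every)
        # Detection and extraction via free continuations of the prefix.
        answers = [extract_answer(generate(prompt + prefix, 64))
                   for _ in range(n_free)]
        answers = [a for a in answers if a is not None]
        if answers:
            best, count = Counter(answers).most_common(1)[0]
            if count / n_free >= agree_threshold:
                return best, prefix  # exit before post-commitment overwriting
    return extract_answer(prefix), prefix
```

The free continuations can be issued in parallel, which is consistent with the abstract's framing of truncating *serial* generation at the cost of a handful of extra API calls.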
Problem

Research questions and friction points this paper is trying to address.

detection–extraction gap
chain-of-thought
early exit
reasoning models
answer recoverability
Innovation

Methods, ideas, or system contributions that make the work stand out.

detection–extraction gap
early exit
chain-of-thought reasoning
free continuation
answer recoverability