The Shape of Overthinking: Backtracking Bursts in Long Reasoning Traces

📅 2026-05-27
🤖 AI Summary
This work addresses the problem of distinguishing effective self-correction from unproductive revision cycles in long reasoning traces. Analyzing 6,000 reasoning traces generated by Qwen3-8B on AIME tasks, the study annotates segment-level backtrack severity and compares correct and incorrect traces by the timing, normalized depth, and clustering of backtracking bursts. Early, isolated repairs are often compatible with correct reasoning, whereas incorrect traces more often show moderate-to-severe backtracks that persist and cluster late. The authors turn this signal into a prefix-causal selective early-exit policy. In their filtering analyses, burst-aware filtering outperforms fixed length-based filtering at shallow and intermediate depths while using only prefix-available features; moderate length cutoffs remain strong completed-trace baselines.
📝 Abstract
Reasoning models often generate long traces in which useful self-correction and unproductive revision are hard to distinguish. We study this distinction through backtracking dynamics: local reconsideration, retraction, or re-derivation inside long-form reasoning traces. On 6,000 Qwen3-8B AIME traces, we annotate segment-level backtrack severity and analyze event timing, normalized depth, and local burst structure. We find that early isolated repair is often compatible with correct reasoning, whereas incorrect traces more often show moderate-to-severe backtracks that persist and cluster late. Cross-corpus checks show the same qualitative asymmetry across additional model/domain pairs. Filtering analyses instantiate the signal as a prefix-causal selective early-exit policy: at shallow and intermediate depths, burst-aware filtering outperforms fixed length-based filtering while using only prefix-available features. Moderate length cutoffs remain strong completed-trace baselines, but burst-aware control provides a deployable mechanism for separating recoverable repair from likely instability.
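The abstract does not spell out how bursts are defined or how the exit policy is thresholded, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical integer severity scale per segment (0 = no backtrack, 1 = mild, 2 = moderate, 3 = severe), and the function names and the parameters `min_severity`, `max_gap`, and `min_events` are all illustrative choices. Note that normalized depth needs the full trace length, so it is used here only for completed-trace analysis, while the exit rule looks at the prefix alone.

```python
from typing import List, Sequence, Tuple


def normalized_depths(n_segments: int) -> List[float]:
    """Position of each segment as a fraction of trace length, in (0, 1].

    Requires the completed trace, so it is an analysis feature,
    not something a prefix-causal policy can use.
    """
    return [(i + 1) / n_segments for i in range(n_segments)]


def find_bursts(
    severity: Sequence[int], min_severity: int = 2, max_gap: int = 1
) -> List[Tuple[int, int]]:
    """Group backtrack events into bursts (assumed definition).

    An event is a segment with severity >= min_severity. Two events join
    the same burst when at most max_gap non-event segments separate them.
    Returns inclusive (start, end) segment indices for each burst.
    """
    bursts: List[Tuple[int, int]] = []
    for i, s in enumerate(severity):
        if s < min_severity:
            continue
        if bursts and i - bursts[-1][1] - 1 <= max_gap:
            bursts[-1] = (bursts[-1][0], i)  # extend the current burst
        else:
            bursts.append((i, i))  # start a new burst
    return bursts


def should_exit_early(
    prefix_severity: Sequence[int],
    min_severity: int = 2,
    max_gap: int = 1,
    min_events: int = 3,
) -> bool:
    """Hypothetical prefix-causal exit rule.

    Flags the trace when the most recent burst is still active (it ended
    within max_gap segments of the prefix end) and already contains at
    least min_events moderate-to-severe backtracks. Only segments seen so
    far are inspected.
    """
    bursts = find_bursts(prefix_severity, min_severity, max_gap)
    if not bursts:
        return False
    start, end = bursts[-1]
    still_active = (len(prefix_severity) - 1 - end) <= max_gap
    n_events = sum(1 for s in prefix_severity[start : end + 1] if s >= min_severity)
    return still_active and n_events >= min_events


# Made-up severity labels for a 10-segment trace (illustrative only).
severity = [0, 1, 0, 0, 2, 3, 0, 2, 0, 0]
bursts = find_bursts(severity)
exit_at_6 = should_exit_early(severity[:6])
exit_at_8 = should_exit_early(severity[:8])
isolated_early_repair = should_exit_early([2, 0, 0, 0])
```

In this toy trace the three moderate-to-severe events at segments 4, 5, and 7 merge into one burst, and the rule fires once the third event is visible in the prefix. A single early backtrack followed by clean segments does not trigger it, which mirrors the qualitative distinction the abstract draws between isolated repair and persistent, clustered backtracking.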
Problem

Research questions and friction points this paper is trying to address.

backtracking
reasoning traces
self-correction
overthinking
burst structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

backtracking bursts
reasoning traces
self-correction
early-exit policy
burst-aware filtering