The Shape of Overthinking: Backtracking Bursts in Long Reasoning Traces

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of distinguishing effective self-correction from unproductive revision in long reasoning traces. Analyzing 6,000 reasoning traces generated by Qwen3-8B on AIME tasks, the study finds that correct and incorrect traces differ in the timing, depth, and clustering of backtracks: early, isolated repair is often compatible with a correct answer, whereas incorrect traces more often show moderate-to-severe backtracks that persist and cluster late. Building on segment-level severity annotation, normalized depth analysis, and burst structure modeling, the authors instantiate the signal as a prefix-causal selective early-exit policy. At shallow and intermediate depths, burst-aware filtering outperforms fixed length-based filtering while using only prefix-available features; moderate length cutoffs remain strong completed-trace baselines.

📝 Abstract

Reasoning models often generate long traces in which useful self-correction and unproductive revision are hard to distinguish. We study this distinction through backtracking dynamics: local reconsideration, retraction, or re-derivation inside long-form reasoning traces. On 6,000 Qwen3-8B AIME traces, we annotate segment-level backtrack severity and analyze event timing, normalized depth, and local burst structure. We find that early isolated repair is often compatible with correct reasoning, whereas incorrect traces more often show moderate-to-severe backtracks that persist and cluster late. Cross-corpus checks show the same qualitative asymmetry across additional model/domain pairs. Filtering analyses instantiate the signal as a prefix-causal selective early-exit policy: at shallow and intermediate depths, burst-aware filtering outperforms fixed length-based filtering while using only prefix-available features. Moderate length cutoffs remain strong completed-trace baselines, but burst-aware control provides a deployable mechanism for separating recoverable repair from likely instability.
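
To make the idea of a prefix-causal, burst-aware filter concrete, here is a minimal sketch in Python. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the 0 to 3 severity scale (0 = none, 1 = mild, 2 = moderate, 3 = severe), the names `find_bursts` and `should_exit_early`, the gap, size, and depth thresholds, and the use of a fixed segment budget `expected_len` to normalize depth when the final trace length is not yet known.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Burst:
    start: int  # index of the first backtrack segment in the burst
    end: int    # index of the last backtrack segment in the burst (inclusive)
    size: int   # number of backtrack events grouped into the burst


def find_bursts(severity: List[int], min_severity: int = 2, max_gap: int = 2) -> List[Burst]:
    """Group backtrack events into bursts.

    A segment counts as an event if its severity is at least `min_severity`
    (assumed scale: 0 none, 1 mild, 2 moderate, 3 severe). Events whose
    segment indices are at most `max_gap` apart join the same burst.
    """
    bursts: List[Burst] = []
    for i, s in enumerate(severity):
        if s < min_severity:
            continue
        if bursts and i - bursts[-1].end <= max_gap:
            bursts[-1].end = i
            bursts[-1].size += 1
        else:
            bursts.append(Burst(start=i, end=i, size=1))
    return bursts


def should_exit_early(
    prefix_severity: List[int],
    expected_len: int,
    min_burst_size: int = 3,
    late_depth: float = 0.5,
) -> bool:
    """Flag a partial trace as likely unstable using only its prefix.

    Depth is normalized by `expected_len`, a hypothetical segment budget,
    because the true trace length is unavailable during generation.
    """
    for burst in find_bursts(prefix_severity):
        depth = burst.start / expected_len
        if burst.size >= min_burst_size and depth >= late_depth:
            return True
    return False


# Early isolated repair: one moderate backtrack near the start.
early_repair = [0, 2, 0, 0, 0, 0, 0, 0, 0, 0]
# Late clustered backtracking: three moderate-to-severe events close together.
late_cluster = [0, 0, 0, 0, 0, 0, 2, 3, 0, 2]

print(should_exit_early(early_repair, expected_len=10))  # False
print(should_exit_early(late_cluster, expected_len=10))  # True
```

Under these assumed thresholds, the isolated early repair is left alone while the late cluster triggers the exit, which mirrors the qualitative asymmetry described in the abstract. The thresholds would need to be tuned on held-out traces before any real use.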

Problem

Research questions and friction points this paper is trying to address.

backtracking

reasoning traces

self-correction

overthinking

burst structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

backtracking bursts

reasoning traces

self-correction