From Growing to Looping: A Unified View of Iterative Computation in LLMs

📅 2026-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the mechanisms by which depth growing and recurrent (looped) architectures enhance the reasoning capabilities of large language models. By analyzing shared computational patterns across layers in both approaches, the study establishes a unified perspective, showing that both rely on a common form of iterative computation to strengthen reasoning. It further proposes a complementary strategy that combines the two methods at inference time without requiring additional training. The analysis spans depth-grown training, inference-time recurrence, activation-pattern analysis, and fine-tuning on high-quality mathematical data. Experimental results show that applying inference-time looping to a depth-grown model can improve accuracy by up to twofold on certain reasoning primitives, that depth-grown models gain most from higher-quality, math-heavy fine-tuning mixtures, and that adapting an intermediate block to loop further boosts these gains.

📝 Abstract
Looping (reusing a block of layers across depth) and depth growing (training shallow-to-deep models by duplicating middle layers) have both been linked to stronger reasoning, but their relationship remains unclear. We provide a mechanistic unification: looped and depth-grown models exhibit convergent depth-wise signatures, including increased reliance on late layers and recurring patterns aligned with the looped or grown block. These shared signatures support the view that their gains stem from a common form of iterative computation. Building on this connection, we show that the two techniques are adaptable and composable: applying inference-time looping to the middle blocks of a depth-grown model improves accuracy on some reasoning primitives by up to $2\times$, despite the model never being trained to loop. Both approaches also adapt better than the baseline when given more in-context examples or additional supervised fine-tuning data. Additionally, depth-grown models achieve the largest reasoning gains when using higher-quality, math-heavy cooldown mixtures, which can be further boosted by adapting a middle block to loop. Overall, our results position depth growth and looping as complementary, practical methods for inducing and scaling iterative computation to improve reasoning.
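The inference-time looping the abstract describes, repeating a contiguous slice of a model's layer stack several times per forward pass, can be illustrated with a minimal sketch. This is not the paper's code: the function name, the list-of-layers model representation, and the toy arithmetic "layers" are all illustrative assumptions chosen to make the control flow easy to trace.

```python
# Minimal sketch of inference-time block looping, assuming a model is
# represented as a list of layer functions applied in order.
# Names and the toy layers are illustrative, not from the paper.

def run_with_loop(layers, x, loop_start, loop_end, n_loops):
    """Apply `layers` to x, repeating layers[loop_start:loop_end] n_loops times."""
    for layer in layers[:loop_start]:      # early layers, run once
        x = layer(x)
    for _ in range(n_loops):               # the "looped" middle block
        for layer in layers[loop_start:loop_end]:
            x = layer(x)
    for layer in layers[loop_end:]:        # late layers, run once
        x = layer(x)
    return x

# Toy "layers" that do simple arithmetic so the effect of looping is visible.
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v + 3]

# n_loops=1 is the ordinary forward pass; n_loops=3 reuses the middle layer.
print(run_with_loop(layers, 0, 1, 2, 1))   # (0+1)*2 + 3 = 5
print(run_with_loop(layers, 0, 1, 2, 3))   # (0+1)*2*2*2 + 3 = 11
```

In a real transformer the "layers" would be decoder blocks and `loop_start:loop_end` would select the middle (grown) block, but the control flow is the same: no retraining is needed to change `n_loops` at inference time.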
Problem

Research questions and friction points this paper is trying to address.

looping
depth growing
iterative computation
reasoning
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

iterative computation
looping
depth growing
reasoning
large language models