🤖 AI Summary
This work addresses the challenge of characterizing the influence of individual training samples across the iterative steps of looped (recurrent) Transformers, a capability that existing data-influence estimation methods lack. To this end, the authors propose Step-Decomposed Influence (SDI), which unrolls the recurrent computation graph to decompose data influence at each loop iteration. By integrating the TracIn framework with a TensorSketch approximation, SDI avoids explicitly materializing per-sample gradients, enabling efficient and scalable fine-grained attribution. Experiments demonstrate that SDI achieves high accuracy and strong scalability on looped GPT-style models and algorithmic reasoning tasks, facilitating multi-dimensional interpretability analyses of the internal reasoning dynamics within recurrent architectures.
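As a concrete toy illustration of the step decomposition (not the paper's implementation), the sketch below uses a one-matrix "looped block" $h \leftarrow \tanh(Wh)$ in NumPy. Because the weights are shared across iterations, the total gradient is a sum of per-iteration contributions, and a single-checkpoint TracIn-style dot product can be split along that sum. The function names `loop_forward`, `per_step_grads`, and `sdi_trajectory` are hypothetical, and one checkpoint stands in for TracIn's full sum over training checkpoints.

```python
import numpy as np

def loop_forward(W, x, tau):
    """Apply the shared block h <- tanh(W h) for tau loop iterations."""
    hs = [x]
    for _ in range(tau):
        hs.append(np.tanh(W @ hs[-1]))
    return hs  # all intermediate states h_0 .. h_tau

def per_step_grads(W, x, y, tau):
    """Split the shared-weight gradient of L = 0.5 * ||h_tau - y||^2 by iteration.

    Because W is reused at every iteration, dL/dW = sum_t G_t, where G_t is
    the contribution flowing through the t-th application of the block.
    """
    hs = loop_forward(W, x, tau)
    g = hs[-1] - y                       # dL/dh_tau
    grads = [None] * tau
    for t in range(tau, 0, -1):
        da = g * (1.0 - hs[t] ** 2)      # back through tanh
        grads[t - 1] = np.outer(da, hs[t - 1])
        g = W.T @ da                     # back to h_{t-1}
    return grads

def sdi_trajectory(W, train, test, tau, lr=0.1):
    """Length-tau influence trajectory at a single checkpoint: per-iteration
    dot products between the train example's step-t gradient slice and the
    test example's total gradient (TracIn-style first-order influence)."""
    g_test = sum(per_step_grads(W, test[0], test[1], tau))
    return [lr * float(np.sum(G * g_test))
            for G in per_step_grads(W, train[0], train[1], tau)]
```

By construction the trajectory sums to the usual aggregate TracIn score at that checkpoint, so the decomposition adds per-step resolution without changing the total.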
📝 Abstract
We study how individual training examples shape the internal computation of looped transformers, where a shared block is applied for $\tau$ recurrent iterations to enable latent reasoning. Existing training-data influence estimators such as TracIn yield a single scalar score that aggregates over all loop iterations, obscuring when during the recurrent computation a training example matters. We introduce \textit{Step-Decomposed Influence (SDI)}, which decomposes TracIn into a length-$\tau$ influence trajectory by unrolling the recurrent computation graph and attributing influence to specific loop iterations. To make SDI practical at transformer scale, we propose a TensorSketch implementation that never materialises per-example gradients. Experiments on looped GPT-style models and algorithmic reasoning tasks show that SDI scales efficiently, matches full-gradient baselines with low approximation error, and supports a broad range of data-attribution and interpretability tasks, providing per-step insights into the latent reasoning process.
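The TensorSketch ingredient can be illustrated in isolation. For a linear layer, a per-example gradient is the outer product of a backpropagated signal $\delta$ and an activation $h$; CountSketching each factor and combining the two sketches by circular convolution (computed with the FFT) yields a $D$-dimensional sketch whose inner products are unbiased estimates of inner products between the full $d_1 \times d_2$ gradients, which are never formed. The following is a minimal NumPy rendition of that standard construction (Pham & Pagh's TensorSketch); `make_tensorsketch` is a hypothetical name, and the paper's actual estimator may differ in detail.

```python
import numpy as np

def make_tensorsketch(d1, d2, D, rng):
    """TensorSketch for outer-product gradients delta @ act.T.

    CountSketch each factor into R^D, then combine the two sketches by
    circular convolution via the FFT. Inner products between resulting
    sketches estimate inner products between the full d1 x d2 gradients
    without ever materialising them.
    """
    b1 = rng.integers(0, D, size=d1)        # hash buckets for the delta factor
    s1 = rng.choice([-1.0, 1.0], size=d1)   # random signs
    b2 = rng.integers(0, D, size=d2)        # hash buckets for the activation
    s2 = rng.choice([-1.0, 1.0], size=d2)

    def sketch(delta, act):
        c1 = np.zeros(D)
        np.add.at(c1, b1, s1 * delta)       # CountSketch of delta
        c2 = np.zeros(D)
        np.add.at(c2, b2, s2 * act)         # CountSketch of the activation
        # circular convolution of the two CountSketches = sketch of delta @ act.T
        return np.fft.irfft(np.fft.rfft(c1) * np.fft.rfft(c2), n=D)

    return sketch
```

An influence estimate between two examples then reduces to a dot product of two length-$D$ sketches rather than of two $d_1 d_2$-dimensional per-example gradients.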