TACT: Mitigating Overthinking and Overacting in Coding Agents via Activation Steering

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Language model agents often suffer from agent drift in complex software engineering tasks due to “overthinking” and “overacting.” This work addresses the issue by modeling drift as a manipulable direction within the residual stream and introduces TACT (Think-Act Calibration via activation Steering). TACT identifies drift axes from annotated trajectories and, during inference, projects each step's activations onto these axes, pulling drifted activations back into a calibrated region to enable real-time intervention. Experimental results show that TACT improves the average resolve rate by 5.8 and 4.8 percentage points on Qwen3.5-27B and Gemma-4-26B-A4B-it, respectively, while reducing the number of steps required to solve tasks by up to 26%.
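To make the idea of a drift axis concrete, the following is a minimal sketch on synthetic data. It assumes the axis is estimated as the normalized difference between the mean hidden state of drifted steps and that of calibrated steps; the summary does not state how TACT actually fits its axes, so this estimator, the function names `drift_axis` and `auc`, and the toy data are all illustrative assumptions.

```python
import numpy as np

def drift_axis(calibrated, drifted):
    """Unit vector from the calibrated mean toward the drifted mean.

    Assumption: a difference-of-means estimator, not necessarily the
    estimator used in the paper.
    """
    direction = drifted.mean(axis=0) - calibrated.mean(axis=0)
    return direction / np.linalg.norm(direction)

def auc(scores_neg, scores_pos):
    """Probability that a random positive outscores a random negative (ties count half)."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# Synthetic stand-in for labeled hidden states (hypothetical data).
rng = np.random.default_rng(0)
dim = 16
true_direction = np.zeros(dim)
true_direction[0] = 1.0
calibrated = rng.normal(size=(200, dim))
overthinking = rng.normal(size=(200, dim)) + 3.0 * true_direction

axis = drift_axis(calibrated, overthinking)
# Score each step by its projection onto the axis, then measure separability.
separation = auc(calibrated @ axis, overthinking @ axis)
```

A second axis for overacting would be estimated the same way from steps labeled as overacting.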

📝 Abstract

When language model agents tackle complex software engineering tasks, they often degrade over long trajectories, a phenomenon we define as *agent drift*. We focus on two recurring failure modes: *overthinking*, where the agent repeatedly reasons over information it already has, and *overacting*, where it issues tool calls without integrating recent observations or acquiring new evidence. In this paper, we introduce TACT (Think-Act Calibration via activation Steering) to detect and mitigate agent drift in the residual stream before it surfaces as a behavioral failure. Specifically, we label trajectory steps as overthinking, overacting, or calibrated, and find that their hidden states are linearly separable along two *drift axes*, each pointing from calibrated behavior toward one failure mode (AUC $\approx$ 0.9). To mitigate agent drift, we project each step's activation onto these axes at test time and pull drifted ones back toward the calibrated region. Experiments show that TACT outperforms unsteered baselines across SWE-bench Verified, Terminal-Bench 2.0, and CLAW-Eval, lifting average resolve rate by $+5.8$ pp on Qwen3.5-27B and $+4.8$ pp on Gemma-4-26B-A4B-it while cutting steps-to-resolve by up to $26\%$. These gains frame agent drift as a steerable direction in the residual stream, and position TACT as a viable handle for reliable long-horizon agents.
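The test-time step described above (project onto a drift axis, pull drifted activations back) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the function `steer`, the `center` of the calibrated region, the projection `threshold`, and the `strength` parameter are hypothetical names introduced here, and the abstract does not specify how the calibrated region is bounded.

```python
import numpy as np

def steer(hidden, axis, center, threshold, strength=1.0):
    """Pull `hidden` back along `axis` when it has drifted past `threshold`.

    hidden:    residual-stream activation for one step, shape (d,)
    axis:      drift axis pointing from calibrated toward a failure mode
    center:    mean calibrated activation (hypothetical reference point)
    threshold: largest projection still treated as calibrated (assumption)
    strength:  fraction of the excess projection to remove (assumption)
    """
    unit = axis / np.linalg.norm(axis)
    projection = float((hidden - center) @ unit)
    excess = projection - threshold
    if excess <= 0:
        return hidden  # inside the calibrated region: leave untouched
    # Remove only the excess along the drift axis; orthogonal parts are kept.
    return hidden - strength * excess * unit

axis = np.array([1.0, 0.0, 0.0, 0.0])
center = np.zeros(4)
drifted = np.array([5.0, 2.0, 0.0, 0.0])
steered = steer(drifted, axis, center, threshold=1.0)
# steered is [1.0, 2.0, 0.0, 0.0]: the projection is clipped to the threshold
```

With two drift axes, the same correction would be applied once per axis. Leaving calibrated activations unchanged keeps the intervention from perturbing steps that have not drifted.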

Problem

Research questions and friction points this paper is trying to address.

agent drift

overthinking

overacting

coding agents

residual stream

Innovation

Methods, ideas, or system contributions that make the work stand out.

activation steering

agent drift

overthinking