To Search or Not to Search: Aligning the Decision Boundary of Deep Search Agents via Causal Intervention

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Deep search agents often struggle to decide when to stop searching and start answering, which leads to either excessive or insufficient search and hurts both efficiency and accuracy. To address this, the paper applies causal intervention to the calibration of decision boundaries and proposes Decision Boundary Alignment for Deep Search agents (DAS). DAS builds preference data by contrasting factual and counterfactual search trajectories at each decision point, then uses preference optimization to align the agent's policy. On public benchmarks, the method mitigates both over-search and under-search, improving answer accuracy and search efficiency together.

📝 Abstract

Deep search agents, which autonomously iterate through multi-turn web-based reasoning, represent a promising paradigm for complex information-seeking tasks. However, current agents suffer from critical inefficiency: they conduct excessive searches because they cannot accurately judge when to stop searching and start answering. This stems from outcome-centric training that prioritizes final results over the search process itself. We identify the root cause as misaligned decision boundaries, that is, the thresholds determining when accumulated information suffices to answer. Misalignment causes both over-search (redundant searching despite sufficient knowledge) and under-search (premature termination yielding incorrect answers). To address these errors, we propose a comprehensive framework comprising two key components. First, we introduce causal intervention-based diagnosis that identifies boundary errors by comparing factual and counterfactual trajectories at each decision point. Second, we develop Decision Boundary Alignment for Deep Search agents (DAS), which constructs preference datasets from causal feedback and aligns policies via preference optimization. Experiments on public datasets demonstrate that decision boundary errors are pervasive across state-of-the-art agents. Our DAS method effectively calibrates these boundaries, mitigating both over-search and under-search to achieve substantial gains in accuracy and efficiency. Our code and data are publicly available at: https://github.com/Applied-Machine-Learning-Lab/WWW2026_DAS.
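
The diagnosis and preference-construction steps described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the names `DecisionPoint`, `diagnose`, and `preference_pair` are hypothetical, and the sketch assumes that the correctness of the factual and counterfactual branches at a decision point has already been measured (for example, by rolling out both actions and grading the final answers).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SEARCH, ANSWER = "search", "answer"


@dataclass(frozen=True)
class DecisionPoint:
    """One step where the agent chose to search again or to answer (hypothetical type)."""
    factual_action: str      # what the agent actually did: SEARCH or ANSWER
    correct_if_answer: bool  # assumed given: is the answer correct if the agent stops here?
    correct_if_search: bool  # assumed given: is the answer correct if it keeps searching?


def diagnose(p: DecisionPoint) -> str:
    """Label a decision point by contrasting the factual action with its counterfactual."""
    # Over-search: the agent searched, although answering now would already be correct.
    if p.factual_action == SEARCH and p.correct_if_answer:
        return "over-search"
    # Under-search: the agent answered wrongly, although more search would have fixed it.
    if p.factual_action == ANSWER and not p.correct_if_answer and p.correct_if_search:
        return "under-search"
    return "aligned"


def preference_pair(p: DecisionPoint) -> Optional[Tuple[str, str]]:
    """Return (chosen, rejected) actions for a boundary error, or None if aligned."""
    label = diagnose(p)
    if label == "over-search":
        return (ANSWER, SEARCH)
    if label == "under-search":
        return (SEARCH, ANSWER)
    return None
```

Under these assumptions, each boundary error yields one (chosen, rejected) pair that a preference-optimization objective could consume. Decision points where neither branch is better produce no pair in this sketch.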

Problem

Research questions and friction points this paper is trying to address.

deep search agents

decision boundary

over-search

under-search

information-seeking tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

causal intervention

decision boundary alignment

deep search agents