🤖 AI Summary
This study addresses the lack of empirical analysis of the search behaviors of large language model (LLM)-driven agents in real-world multi-turn retrieval, particularly regarding conversational intent evolution and evidence utilization mechanisms. Leveraging 14.44 million real search requests, the authors sessionize the logs and employ LLM-assisted annotation to label intents and query reformulations, proposing the Context-driven Term Adoption Rate (CTAR) to quantify evidence traceability. The work reveals key dynamic characteristics of authentic agent-driven search: over 90% of multi-turn sessions span no more than ten steps, 89% of consecutive steps occur within one minute, and 54% of new query terms can be traced back to accumulated evidence. Furthermore, it demonstrates that intent type significantly influences patterns of exploration versus repetition, offering critical empirical insights for optimizing agent-based search systems.
📝 Abstract
LLM-powered search agents are increasingly being used for multi-step information-seeking tasks, yet the IR community lacks an empirical understanding of how agentic search sessions unfold and how retrieved evidence is used. This paper presents a large-scale log analysis of agentic search based on 14.44M search requests (3.97M sessions) collected from DeepResearchGym, an open-source search API accessed by external agentic clients. We sessionize the logs, assign session-level intents and step-wise query-reformulation labels using LLM-based annotation, and propose the Context-driven Term Adoption Rate (CTAR) to quantify whether newly introduced query terms are traceable to previously retrieved evidence. Our analyses reveal distinctive behavioral patterns. First, over 90% of multi-turn sessions contain at most ten steps, and 89% of inter-step intervals fall under one minute. Second, behavior varies by intent: fact-seeking sessions exhibit high repetition that increases over time, while sessions requiring reasoning sustain broader exploration. Third, agents reuse evidence across steps. On average, 54% of newly introduced query terms appear in the accumulated evidence context, with contributions from earlier steps beyond the most recent retrieval. The findings suggest that agentic search may benefit from repetition-aware early stopping, intent-adaptive retrieval budgets, and explicit cross-step context tracking. We plan to release the anonymized logs to support future research.
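To make the CTAR idea concrete, here is a minimal sketch of how such a metric could be computed over a session. The paper's exact tokenization, term-matching rules, and function names are not specified in the abstract, so everything below (the `ctar` helper, whitespace tokenization, lowercase matching) is an illustrative assumption, not the authors' implementation.

```python
def ctar(steps):
    """Sketch of a Context-driven Term Adoption Rate computation.

    steps: list of (query, evidence_text) pairs in session order.
    A "new" query term is a token not seen in any earlier query; it counts
    as "adopted" if it appears in the evidence accumulated from all earlier
    steps (not just the most recent retrieval). Returns the adopted fraction.
    """
    seen_query_terms = set()   # terms used in any prior query
    evidence_terms = set()     # terms from all prior retrieved evidence
    new_terms = adopted = 0
    for query, evidence in steps:
        q_terms = set(query.lower().split())
        if seen_query_terms:   # the opening query has no context to trace
            fresh = q_terms - seen_query_terms
            new_terms += len(fresh)
            adopted += len(fresh & evidence_terms)
        seen_query_terms |= q_terms
        evidence_terms |= set(evidence.lower().split())
    return adopted / new_terms if new_terms else 0.0
```

For example, if a second-step query introduces three terms and all three appeared in the first step's retrieved text, this sketch returns 1.0; if none appeared, it returns 0.0. Aggregating across sessions would yield a session-level average like the 54% reported above.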