Agentic Search in the Wild: Intents and Trajectory Dynamics from 14M+ Real Search Requests

📅 2026-01-24
🤖 AI Summary
This study addresses the lack of empirical analysis of how large language model (LLM)-driven agents search in real-world, multi-step retrieval, particularly how session intent shapes behavior and how retrieved evidence is reused. Using 14.44 million real search requests (3.97 million sessions), the authors sessionize the logs, label session-level intents and step-wise query reformulations with LLM-based annotation, and propose the Context-driven Term Adoption Rate (CTAR) to quantify whether newly introduced query terms are traceable to previously retrieved evidence. The work reports three characteristics of real agent-driven search: over 90% of multi-turn sessions contain at most ten steps, 89% of inter-step intervals fall under one minute, and on average 54% of newly introduced query terms appear in the accumulated evidence context. It also shows that behavior varies by intent: fact-seeking sessions exhibit high repetition that increases over time, while sessions requiring reasoning sustain broader exploration. These findings offer empirical guidance for optimizing agent-based search systems.

📝 Abstract
LLM-powered search agents are increasingly being used for multi-step information-seeking tasks, yet the IR community lacks an empirical understanding of how agentic search sessions unfold and how retrieved evidence is used. This paper presents a large-scale log analysis of agentic search based on 14.44M search requests (3.97M sessions) collected from DeepResearchGym, an open-source search API accessed by external agentic clients. We sessionize the logs, assign session-level intents and step-wise query-reformulation labels using LLM-based annotation, and propose Context-driven Term Adoption Rate (CTAR) to quantify whether newly introduced query terms are traceable to previously retrieved evidence. Our analyses reveal distinctive behavioral patterns. First, over 90% of multi-turn sessions contain at most ten steps, and 89% of inter-step intervals fall under one minute. Second, behavior varies by intent. Fact-seeking sessions exhibit high repetition that increases over time, while sessions requiring reasoning sustain broader exploration. Third, agents reuse evidence across steps. On average, 54% of newly introduced query terms appear in the accumulated evidence context, with contributions from earlier steps beyond the most recent retrieval. The findings suggest that agentic search may benefit from repetition-aware early stopping, intent-adaptive retrieval budgets, and explicit cross-step context tracking. We plan to release the anonymized logs to support future research.
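To make the CTAR idea concrete, here is a minimal sketch of how such a rate could be computed for one session. It is an illustration only, not the authors' implementation: the abstract does not give the exact formula, so the tokenizer, the definition of a "new" term (one not seen in any earlier query of the session), and the function names `tokenize` and `ctar_per_step` are all assumptions.

```python
import re
from typing import List, Optional, Set


def tokenize(text: str) -> Set[str]:
    # Assumption: lowercase alphanumeric tokens; the paper's tokenizer may differ.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ctar_per_step(queries: List[str], evidence: List[str]) -> List[Optional[float]]:
    """Hypothetical per-step CTAR for one session.

    queries[i]  is the query issued at step i.
    evidence[i] is the concatenated text retrieved for queries[i].
    Returns one value per step from the second step on: the share of newly
    introduced query terms found in the evidence accumulated over all earlier
    steps, or None when the step introduces no new terms.
    """
    rates: List[Optional[float]] = []
    seen_query_terms = tokenize(queries[0]) if queries else set()
    context_terms: Set[str] = set()
    for step in range(1, len(queries)):
        # Accumulate evidence from every earlier step, not just the latest one.
        context_terms |= tokenize(evidence[step - 1])
        current = tokenize(queries[step])
        new_terms = current - seen_query_terms
        if new_terms:
            rates.append(len(new_terms & context_terms) / len(new_terms))
        else:
            rates.append(None)
        seen_query_terms |= current
    return rates


# Toy session (invented data, for illustration only).
queries = [
    "python gil",
    "python gil removal pep",
    "pep 703 free threading benchmarks",
]
evidence = [
    "PEP 703 proposes removal of the GIL",
    "free threading builds in CPython 3.13",
]
print(ctar_per_step(queries, evidence))
```

In the toy session, the term "703" in the third query is traceable only to the evidence retrieved at the first step, which is the kind of contribution "from earlier steps beyond the most recent retrieval" that the abstract describes. Accumulating `context_terms` across steps, rather than resetting it each step, is what lets the sketch capture that.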
Problem

Research questions and friction points this paper is trying to address.

agentic search
search behavior
intent analysis
evidence reuse
multi-step information seeking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic Search
Context-driven Term Adoption Rate (CTAR)
LLM-based Annotation
Search Trajectory Analysis
Evidence Reuse
Jingjie Ning
Carnegie Mellon University
João Coelho
Carnegie Mellon University
Yibo Kong
Carnegie Mellon University
Yunfan Long
Carnegie Mellon University
Bruno Martins
Instituto Superior Técnico and INESC-ID, University of Lisbon
Data Science, Language Technologies, Information Retrieval, Geospatial A.I.
João Magalhães
NOVA LINCS, NOVA University Lisbon
James P. Callan
Carnegie Mellon University
Chenyan Xiong
Associate Professor, Carnegie Mellon University
Information Retrieval, Language Models, Natural Language Understanding