🤖 AI Summary
This work addresses a limitation of existing retrieval systems: they ignore the rich intent and contextual information in the natural language reasoning traces that Deep Research agents generate before formulating each query. To exploit this signal, the authors propose Reasoning-Aware Retrieval, a paradigm that jointly embeds the agent's reasoning trace alongside its query, and DR-Synth, a data synthesis method that turns standard question-answering datasets into training data for agent-oriented retrievers. The resulting embedding model, AgentIR-4B, achieves 68% accuracy on the BrowseComp-Plus benchmark with the open-weight agent Tongyi-DeepResearch, substantially outperforming both a conventional embedding model twice its size (50%) and BM25 (37%).
📝 Abstract
Deep Research agents are rapidly emerging as primary consumers of modern retrieval systems. Unlike human users, who issue and refine queries without documenting their intermediate thought processes, Deep Research agents generate explicit natural language reasoning before each search call, revealing rich intent and contextual information that existing retrievers entirely ignore. To exploit this overlooked signal, we introduce: (1) Reasoning-Aware Retrieval, a retrieval paradigm that jointly embeds the agent's reasoning trace alongside its query; and (2) DR-Synth, a data synthesis method that generates Deep Research retriever training data from standard QA datasets. We demonstrate that both components are independently effective, and that their combination yields a trained embedding model, AgentIR-4B, with substantial gains. On the challenging BrowseComp-Plus benchmark, AgentIR-4B achieves 68% accuracy with the open-weight agent Tongyi-DeepResearch, compared to 50% with a conventional embedding model twice its size, and 37% with BM25. Code and data are available at: https://texttron.github.io/AgentIR/.