Deep Research: A Systematic Survey

📅 2025-11-24
🤖 AI Summary
Large language models (LLMs) exhibit significant limitations in open-ended research tasks that require multi-source verification and critical reasoning; single-turn prompting and standard retrieval-augmented generation (RAG) prove inadequate. Method: The work formally defines "Deep Research" as a three-stage paradigm (query planning, information acquisition, and answer generation) and proposes a four-component framework that adds memory management to these stages. It introduces a fine-grained taxonomy, consolidates evaluation criteria and open challenges into a dynamically updatable research map, and surveys prompt engineering, supervised fine-tuning, and agent-based reinforcement learning as techniques for optimizing LLM–tool collaboration (e.g., with search engines). Contribution/Results: The survey delivers a first systematic architecture for Deep Research, establishes key technical pathways (query decomposition, iterative evidence synthesis, and stateful reasoning), and provides an open benchmark and a living research map to guide future development.
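The four-component framework described above can be sketched as a simple iterative loop. This is a minimal illustration only: the names (`plan_queries`, `acquire`, `Memory`, `deep_research`) and the toy stopping rule are placeholders invented here, not APIs or algorithms from the paper.

```python
# Hypothetical sketch of a Deep Research loop with the survey's four components:
# query planning -> information acquisition -> memory management -> answer generation.
from dataclasses import dataclass, field

@dataclass
class Memory:
    """Stateful store of evidence gathered across iterations (memory management)."""
    notes: list = field(default_factory=list)

    def add(self, query, passages):
        self.notes.append((query, passages))

    def is_sufficient(self):
        return len(self.notes) >= 3  # toy stopping rule for illustration

def plan_queries(question, memory):
    # Query planning: decompose the question into sub-queries (toy decomposition).
    return [f"{question} (aspect {len(memory.notes) + 1})"]

def acquire(query):
    # Information acquisition: call an external tool, e.g. a search engine (stubbed).
    return [f"passage retrieved for: {query}"]

def generate_answer(question, memory):
    # Answer generation: synthesize accumulated evidence into a final response (stubbed).
    evidence = "; ".join(p for _, passages in memory.notes for p in passages)
    return f"Answer to '{question}' based on {len(memory.notes)} rounds: {evidence}"

def deep_research(question, max_rounds=5):
    memory = Memory()
    for _ in range(max_rounds):  # iterative plan-search-update loop
        for q in plan_queries(question, memory):
            memory.add(q, acquire(q))
        if memory.is_sufficient():
            break
    return generate_answer(question, memory)
```

In a real system, the stubs would be an LLM-driven planner, a live retrieval tool, and a generator optimized via the techniques the survey catalogs (prompting, supervised fine-tuning, or agentic reinforcement learning).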

📝 Abstract
Large language models (LLMs) have rapidly evolved from text generators into powerful problem solvers. Yet, many open tasks demand critical thinking, multi-source evidence, and verifiable outputs, which are beyond single-shot prompting or standard retrieval-augmented generation. Recently, numerous studies have explored Deep Research (DR), which aims to combine the reasoning capabilities of LLMs with external tools, such as search engines, thereby empowering LLMs to act as research agents capable of completing complex, open-ended tasks. This survey presents a comprehensive and systematic overview of deep research systems, including a clear roadmap, foundational components, practical implementation techniques, important challenges, and future directions. Specifically, our main contributions are as follows: (i) we formalize a three-stage roadmap and distinguish deep research from related paradigms; (ii) we introduce four key components: query planning, information acquisition, memory management, and answer generation, each paired with fine-grained sub-taxonomies; (iii) we summarize optimization techniques, including prompting, supervised fine-tuning, and agentic reinforcement learning; and (iv) we consolidate evaluation criteria and open challenges, aiming to guide and facilitate future development. As the field of deep research continues to evolve rapidly, we are committed to continuously updating this survey to reflect the latest progress in this area.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLMs for complex tasks requiring critical thinking and verifiable outputs
Integrating external tools like search engines to empower LLMs as research agents
Providing a systematic survey on deep research systems, components, and challenges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates LLMs with external tools for complex tasks
Uses multi-stage roadmap with four key components
Applies optimization techniques like prompting and fine-tuning
Authors

Zhengliang Shi (Shandong University): Natural Language Processing, LLM Agents, Knowledge Discovery
Yiqun Chen (Renmin University of China): Information Retrieval, Retrieval-Augmented Generation, Reinforcement Learning, Multi-Agent Systems
Haitao Li (Tsinghua University)
Weiwei Sun (Carnegie Mellon University)
Shiyu Ni (UCAS)
Yougang Lyu (University of Amsterdam): Natural Language Processing, Large Language Models, Information Retrieval
Run-Ze Fan (University of Massachusetts Amherst): LLMs, Data Engineering, Reasoning
Bowen Jin (University of Illinois Urbana-Champaign): Large Language Models, Agents, RL
Yixuan Weng (Westlake University)
Minjun Zhu (Westlake University; CASIA): Natural Language Processing
Qiujie Xie (Westlake University)
Xinyu Guo (Samsung Research America): AI, Computer Vision, Machine Learning, Medical Image Analysis
Qu Yang (National University of Singapore): Deep Learning, Spiking Neural Networks, Neuromorphic Computing
Jiayi Wu (Tencent)
Jujia Zhao (Leiden University): Information Retrieval, Recommendation
Xiaqiang Tang (HKUST(GZ)): LLMs, RAG, Trustworthy AI
Xinbei Ma (Shanghai Jiao Tong University)
Cunxiang Wang (Tsinghua University; ZhipuAI): Large Language Models, LLM Evaluation, LLM Post-training
Jiaxin Mao (Renmin University of China): Information Retrieval, User Behavior Analysis, Data Mining and Machine Learning
Qingyao Ai (Associate Professor, Dept. of CS&T, Tsinghua University): Information Retrieval, Machine Learning
Jen-Tse Huang (Johns Hopkins University): Artificial Intelligence, Natural Language Processing, Large Language Models
Wenxuan Wang (Renmin University of China)
Yue Zhang (Westlake University)
Yiming Yang (Carnegie Mellon University)
Zhaopeng Tu (Tech Lead @ Tencent Digital Human): Digital Humans, Agents, Large Language Models, Machine Translation
Zhaochun Ren (Leiden University): Information Retrieval, Natural Language Processing