🤖 AI Summary
Enterprises need to convert large volumes of unstructured data into actionable insights, yet existing autonomous agent systems are limited in domain adaptability, intent alignment, and enterprise integration. To address these challenges, we propose a steerable multi-agent deep research system with a hierarchical architecture comprising a central planning agent and specialized search agents. A reflection mechanism identifies knowledge gaps and re-plans the research path, with optional real-time human-in-the-loop guidance. The system integrates the MCP tooling ecosystem, NL2SQL translation, multi-source heterogeneous document parsing, vector-based retrieval, academic and social network search, and streaming visualization. On the DeepResearch Bench and DeepConsult benchmarks, the system outperforms state-of-the-art agentic approaches while generating analytical reports without any human steering, and it is further validated on internal enterprise datasets.
📝 Abstract
As information grows exponentially, enterprises face increasing pressure to transform unstructured data into coherent, actionable insights. While autonomous agents show promise, they often struggle with domain-specific nuances, intent alignment, and enterprise integration. We present Enterprise Deep Research (EDR), a multi-agent system that integrates (1) a Master Planning Agent for adaptive query decomposition, (2) four specialized search agents (General, Academic, GitHub, LinkedIn), (3) an extensible MCP-based tool ecosystem supporting NL2SQL, file analysis, and enterprise workflows, (4) a Visualization Agent for data-driven insights, and (5) a reflection mechanism that detects knowledge gaps and updates research direction with optional human-in-the-loop steering. These components enable automated report generation, real-time streaming, and seamless enterprise deployment, as validated on internal datasets. On open-ended benchmarks including DeepResearch Bench and DeepConsult, EDR outperforms state-of-the-art agentic systems without any human steering. We release the EDR framework and benchmark trajectories to advance research on multi-agent reasoning applications.
Code is available at https://github.com/SalesforceAIResearch/enterprise-deep-research and the dataset at https://huggingface.co/datasets/Salesforce/EDR-200
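
The control flow described in the abstract (a planner decomposes the query, search agents gather evidence, a reflection step detects knowledge gaps and re-plans, and a human may optionally steer) can be illustrated with a minimal Python sketch. This is not the EDR implementation: every name here (`ResearchState`, `plan`, `reflect`, `run_research`, the `steer` callback) is hypothetical, the agents are plain functions standing in for LLM-backed components, and the "broadened" follow-up query is only a placeholder for whatever re-planning the real system performs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration; none of these come from the EDR codebase.
SearchAgent = Callable[[str], List[str]]        # query -> list of findings
Steer = Callable[[List[str]], List[str]]        # human edits the pending queries


@dataclass
class ResearchState:
    topic: str
    pending: List[str] = field(default_factory=list)             # sub-queries to run
    findings: Dict[str, List[str]] = field(default_factory=dict)  # results per query


def plan(topic: str) -> List[str]:
    """Stand-in for the planning agent: decompose a topic into sub-queries."""
    return [f"{topic}: background", f"{topic}: recent work"]


def reflect(state: ResearchState, just_run: List[str]) -> List[str]:
    """Stand-in for reflection: a query that returned nothing is a knowledge gap."""
    return [q for q in just_run if not state.findings[q]]


def run_research(
    topic: str,
    agents: Dict[str, SearchAgent],
    steer: Optional[Steer] = None,
    max_rounds: int = 3,
) -> ResearchState:
    state = ResearchState(topic=topic, pending=plan(topic))
    for _ in range(max_rounds):
        if not state.pending:
            break
        just_run = list(state.pending)
        for query in just_run:
            hits: List[str] = []
            for agent in agents.values():   # fan the query out to every search agent
                hits.extend(agent(query))
            state.findings[query] = hits
        # Re-plan: turn each detected gap into a follow-up query (placeholder logic).
        state.pending = [f"{gap} (broadened)" for gap in reflect(state, just_run)]
        # Optional human-in-the-loop steering: the callback may add, drop, or rewrite queries.
        if steer is not None:
            state.pending = steer(state.pending)
    return state


if __name__ == "__main__":
    # A toy agent that only "finds" something for some queries.
    toy = {"general": lambda q: [f"doc about {q}"] if "background" in q else []}
    result = run_research("multi-agent systems", toy)
    for query, hits in result.findings.items():
        print(query, "->", hits)
```

Passing a `steer` callback that returns an empty list stops the loop after the first round, which mirrors the idea that steering is optional and can redirect or cut short the research path; omitting it gives the fully autonomous mode evaluated on the benchmarks.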