🤖 AI Summary
Enterprise-grade RAG systems in high-stakes decision-making are often hindered by shallow retrieval, lack of traceability, and fragility to ambiguous queries. To address these limitations, this work proposes ADORE, a framework that orchestrates multiple specialized agents through a central coordinator to perform user-guided, iterative deep retrieval and synthesis. Key innovations include a structured memory repository based on Claim-Evidence Graphs, a memory-locking synthesis mechanism, an evidence-coverage-guided execution pipeline, segmented packing and compression for long-context handling, and an evidence-driven termination criterion, which together enable the generation of verifiable, fully traceable reports. ADORE achieves state-of-the-art performance with a score of 52.65 on DeepResearch Bench and a 77.2% preference win rate over existing commercial systems in the DeepConsult evaluation.
📝 Abstract
Retrieval-Augmented Generation (RAG) shows promise for enterprise knowledge work, yet it often underperforms in high-stakes decision settings that require deep synthesis, strict traceability, and recovery from underspecified prompts. One-pass retrieval-and-write pipelines frequently yield shallow summaries, inconsistent grounding, and weak mechanisms for completeness verification. We introduce ADORE (Adaptive Deep Orchestration for Research in Enterprise), an agentic framework that replaces linear retrieval with iterative, user-steered investigation coordinated by a central orchestrator and a set of specialized agents. ADORE's key insight is that a structured Memory Bank (a curated evidence store with explicit claim-evidence linkage and section-level admissible evidence) enables traceable report generation and systematic checks for evidence completeness. Our contributions are threefold: (1) Memory-locked synthesis - report generation is constrained to a structured Memory Bank (Claim-Evidence Graph) with section-level admissible evidence, enabling traceable claims and grounded citations; (2) Evidence-coverage-guided execution - a retrieval-reflection loop audits section-level evidence coverage to trigger targeted follow-up retrieval and terminates via an evidence-driven stopping criterion; (3) Section-packed long-context grounding - section-level packing, pruning, and citation-preserving compression make long-form synthesis feasible under context limits. Across our evaluation suite, ADORE ranks first on DeepResearch Bench (52.65) and achieves the highest head-to-head preference win rate on DeepConsult (77.2%) against commercial systems.
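To make the first contribution concrete, here is a minimal sketch of what a structured Memory Bank with explicit claim-evidence linkage and section-level admissible evidence could look like. This is an illustration under our own assumptions, not the paper's implementation: all class, method, and field names (`MemoryBank`, `add_claim`, `admissible`, etc.) are hypothetical. The key idea shown is memory-locking, where a claim is only admitted if every piece of evidence it cites already exists in the bank, so every claim in the final report remains traceable.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    id: str
    source: str   # e.g., document URL or chunk identifier
    text: str


@dataclass
class Claim:
    id: str
    text: str
    evidence_ids: list  # edges of the Claim-Evidence Graph


class MemoryBank:
    """Curated evidence store with claim-evidence linkage (hypothetical sketch)."""

    def __init__(self):
        self.evidence = {}          # evidence id -> Evidence
        self.claims = {}            # claim id -> Claim
        self.section_evidence = {}  # section name -> set of admissible evidence ids

    def add_evidence(self, ev, section):
        self.evidence[ev.id] = ev
        self.section_evidence.setdefault(section, set()).add(ev.id)

    def add_claim(self, claim):
        # Memory-locking: reject any claim citing evidence outside the bank,
        # so every emitted claim is groundable to stored evidence.
        if not all(eid in self.evidence for eid in claim.evidence_ids):
            raise ValueError(f"claim {claim.id} cites unknown evidence")
        self.claims[claim.id] = claim

    def admissible(self, section):
        """Evidence the synthesis step is allowed to cite for this section."""
        return [self.evidence[i] for i in sorted(self.section_evidence.get(section, set()))]
```

In this reading, report generation would query `admissible(section)` for each section and only emit claims accepted by `add_claim`, which is what makes citations checkable after the fact.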
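The second contribution, evidence-coverage-guided execution, can likewise be sketched as a retrieval-reflection loop. This is a speculative simplification: the `coverage` heuristic, the `threshold`, and the `retrieve` callback are assumptions introduced for illustration, not details from the paper. What it does show is the stated control flow: audit per-section evidence coverage, issue targeted follow-up retrieval only for under-covered sections, and stop when coverage is sufficient (the evidence-driven termination criterion) rather than after a fixed number of passes.

```python
def coverage(section_questions, section_memory):
    """Fraction of a section's guiding questions with at least one evidence hit."""
    if not section_questions:
        return 1.0
    answered = sum(1 for q in section_questions if section_memory.get(q))
    return answered / len(section_questions)


def research_loop(sections, retrieve, threshold=0.8, max_rounds=5):
    """Retrieval-reflection loop (illustrative sketch).

    sections: mapping of section name -> list of guiding questions.
    retrieve: callback taking a question and returning a list of evidence snippets.
    Stops early once every section's coverage clears the threshold.
    """
    memory = {s: {} for s in sections}
    for _ in range(max_rounds):
        # Reflection step: audit section-level evidence coverage.
        gaps = [s for s, qs in sections.items()
                if coverage(qs, memory[s]) < threshold]
        if not gaps:
            break  # evidence-driven termination: all sections sufficiently covered
        # Targeted follow-up retrieval, restricted to under-covered sections.
        for s in gaps:
            for q in sections[s]:
                if not memory[s].get(q):
                    memory[s][q] = retrieve(q)
    return memory
```

The design choice worth noting is that termination depends on the state of the evidence, not on a retrieval budget alone; `max_rounds` acts only as a safety cap for questions the retriever cannot answer.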