Agentic Multi-Source Grounding for Enhanced Query Intent Understanding: A DoorDash Case Study

📅 2026-03-02
🤖 AI Summary
This work addresses ambiguous user intent in multi-category marketplaces, where context-sparse queries such as “Wildflower” cause existing approaches to suffer from single-label bias (traditional classifiers) or hallucinated inventory (general-purpose LLMs). The authors propose a decoupled agent architecture that grounds LLM inference in staged product catalog retrieval and in a web search tool the agent triggers autonomously for cold-start queries, producing an ordered set of candidate intents. A configurable disambiguation layer then applies business logic to resolve ambiguities. The system is described as the first to combine agent-driven web search with catalog grounding for query understanding, and its decoupled design lets other marketplaces plug in their own proprietary data sources and rules. Deployed on DoorDash, it serves over 95% of daily search impressions and improves accuracy by 10.9 percentage points over the ungrounded LLM baseline and by 4.6 points over the legacy production system. On long-tail queries it reaches 90.7% accuracy, a 13.0 percentage point improvement over the baseline, with incremental ablations attributing +8.3pp to catalog grounding, +3.2pp to web search, and +1.5pp to dual-intent disambiguation.
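The ablation figures quoted above can be checked with simple arithmetic. The sketch below only reuses the reported numbers; the implied baseline value is derived here, not stated in the source, and it assumes the three increments are cumulative steps from the same baseline.

```python
# Reported figures, in percentage points (taken from the summary above).
increments = {
    "catalog grounding": 8.3,
    "agentic web search": 3.2,
    "dual-intent disambiguation": 1.5,
}
final_accuracy = 90.7

# The three increments should add up to the reported total gain.
total_gain = round(sum(increments.values()), 1)

# Derived value (an inference, not a number reported in the source).
implied_baseline = round(final_accuracy - total_gain, 1)

# Walk the ablation ladder from the implied baseline to the final accuracy.
running = implied_baseline
steps = []
for name, gain in increments.items():
    running = round(running + gain, 1)
    steps.append((name, running))

print(f"total gain: +{total_gain}pp, implied baseline: {implied_baseline}%")
for name, accuracy in steps:
    print(f"+ {name}: {accuracy}%")
```

The increments sum to the reported +13.0pp, so the stated breakdown is internally consistent.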

📝 Abstract
Accurately mapping user queries to business categories is a fundamental Information Retrieval challenge for multi-category marketplaces, where context-sparse queries such as "Wildflower" exhibit intent ambiguity, simultaneously denoting a restaurant chain, a retail product, and a floral item. Traditional classifiers force a winner-takes-all assignment, while general-purpose LLMs hallucinate unavailable inventory. We introduce an Agentic Multi-Source Grounded system that addresses both failure modes by grounding LLM inference in (i) a staged catalog entity retrieval pipeline and (ii) an agentic web-search tool invoked autonomously for cold-start queries. Rather than predicting a single label, the model emits an ordered multi-intent set, resolved by a configurable disambiguation layer that applies deterministic business policies and is designed for extensibility to personalization signals. This decoupled design generalizes across domains, allowing any marketplace to supply its own grounding sources and resolution rules without modifying the core architecture. Evaluated on DoorDash's multi-vertical search platform, the system achieves +10.9pp over the ungrounded LLM baseline and +4.6pp over the legacy production system. On long-tail queries, incremental ablations attribute +8.3pp to catalog grounding, +3.2pp to agentic web search grounding, and +1.5pp to dual intent disambiguation, yielding 90.7% accuracy (+13.0pp over baseline). The system is deployed in production, serving over 95% of daily search impressions, and establishes a generalizable paradigm for applications requiring foundation models grounded in proprietary context and real-time web knowledge to resolve ambiguous, context-sparse decision problems at scale.
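The decoupled design described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the toy `CATALOG`, the `web_search_stub` function, the scores, and the "no catalog hits" rule that stands in for the agent's autonomous decision to call web search. It is not the authors' implementation, and no LLM is involved; it only shows how grounding, an ordered multi-intent output, and a pluggable policy layer can be kept separate.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Intent:
    vertical: str  # business category, e.g. "restaurant"
    source: str    # grounding source that supported it: "catalog" or "web"


# Hypothetical toy catalog: query -> (vertical, relevance score) pairs.
CATALOG = {
    "wildflower": [("retail", 0.6), ("restaurant", 0.9), ("floral", 0.4)],
}


def retrieve_catalog_entities(query: str) -> List[Tuple[str, float]]:
    """Stand-in for the staged catalog entity retrieval pipeline."""
    return CATALOG.get(query.strip().lower(), [])


def web_search_stub(query: str) -> List[Tuple[str, float]]:
    """Stand-in for a web-search tool; returns a fixed illustrative hit."""
    return [("retail", 0.5)] if query.strip() else []


def propose_intents(
    query: str,
    search: Callable[[str], List[Tuple[str, float]]] = web_search_stub,
) -> List[Intent]:
    """Return an ordered multi-intent set grounded in catalog or web hits."""
    hits, source = retrieve_catalog_entities(query), "catalog"
    if not hits:
        # Assumed cold-start rule: fall back to web search when the
        # catalog has nothing for this query.
        hits, source = search(query), "web"
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    return [Intent(vertical, source) for vertical, _ in ranked]


def disambiguate(
    intents: List[Intent],
    policy: Callable[[Intent], bool],
    max_intents: int = 2,
) -> List[Intent]:
    """Apply a deterministic business policy, keeping the top-ranked intents."""
    return [intent for intent in intents if policy(intent)][:max_intents]


if __name__ == "__main__":
    candidates = propose_intents("Wildflower")
    # Example policy (hypothetical): this marketplace has no floral vertical.
    print(disambiguate(candidates, lambda intent: intent.vertical != "floral"))
```

Because the policy is passed in as a function, a different marketplace could swap in its own catalog, search tool, and resolution rules without touching `propose_intents`, which mirrors the extensibility claim in the abstract.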
Problem

Research questions and friction points this paper is trying to address.

query intent ambiguity
multi-category marketplace
context-sparse queries
intent disambiguation
information retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic grounding
multi-source retrieval
query intent disambiguation
foundation model grounding
multi-intent resolution
Authors
Emmanuel Aboah Boateng
DoorDash, Inc., San Francisco, California, USA
Kyle MacDonald
NLP and ML at DoorDash
Development, Cognitive Science, Language, Artificial Intelligence
Akshad Viswanathan
DoorDash, Inc., San Francisco, California, USA
Sudeep Das
DoorDash, Inc., San Francisco, California, USA