EigentSearch-Q+: Enhancing Deep Research Agents with Structured Reasoning Tools

📅 2026-04-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses redundant exploration and brittle evidence aggregation in deep research agents that answer open-ended questions, problems the authors attribute to implicit, unstructured search behavior. The authors propose Q+, a set of query and evidence processing tools integrated into the browser sub-agent of the open-source Eigent multi-agent framework. Q+ makes query planning, search progress monitoring, and evidence extraction from long web snapshots explicit, drawing on Anthropic's "think" tool paradigm and on the information-retrieval literature. Across four benchmarks (SimpleQA-Verified, FRAMES, WebWalkerQA, and X-Bench DeepSearch), Q+ improves benchmark-size-weighted average accuracy by 3.0, 3.8, and 0.6 percentage points with GPT-4.1, GPT-5.1, and Minimax M2.5 backends, respectively. Case studies suggest that it also yields more coherent tool-calling trajectories.
๐Ÿ“ Abstract
Deep research requires reasoning over web evidence to answer open-ended questions, and it is a core capability for AI agents. Yet many deep research agents still rely on implicit, unstructured search behavior that causes redundant exploration and brittle evidence aggregation. Motivated by Anthropic's "think" tool paradigm and insights from the information-retrieval literature, we introduce Q+, a set of query and evidence processing tools that make web search more deliberate by guiding query planning, monitoring search progress, and extracting evidence from long web snapshots. We integrate Q+ into the browser sub-agent of Eigent, an open-source, production-ready multi-agent workforce for computer use, yielding EigentSearch-Q+. Across four benchmarks (SimpleQA-Verified, FRAMES, WebWalkerQA, and X-Bench DeepSearch), Q+ improves Eigent's browser agent benchmark-size-weighted average accuracy by 3.0, 3.8, and 0.6 percentage points (pp) for GPT-4.1, GPT-5.1, and Minimax M2.5 model backends, respectively. Case studies further suggest that EigentSearch-Q+ produces more coherent tool-calling trajectories by making search progress and evidence handling explicit.
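The abstract reports gains in "benchmark-size-weighted average accuracy" without spelling out the computation. A minimal sketch of one plausible reading follows, assuming each benchmark's accuracy is weighted by its number of questions. The function name, benchmark names, sizes, and accuracy values are hypothetical placeholders and are not figures from the paper.

```python
from typing import Mapping


def weighted_average_accuracy(
    accuracy: Mapping[str, float], sizes: Mapping[str, int]
) -> float:
    """Average per-benchmark accuracy, weighting each benchmark by its size."""
    if set(accuracy) != set(sizes):
        raise ValueError("accuracy and sizes must cover the same benchmarks")
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("total benchmark size must be positive")
    return sum(accuracy[name] * sizes[name] for name in accuracy) / total


# Placeholder values for illustration only; NOT results from the paper.
example_sizes = {"bench_a": 100, "bench_b": 300}
baseline = {"bench_a": 50.0, "bench_b": 70.0}    # accuracy in percent
with_tools = {"bench_a": 54.0, "bench_b": 72.0}  # accuracy in percent

# Gain in percentage points (pp) between the two weighted averages.
gain_pp = weighted_average_accuracy(with_tools, example_sizes) - weighted_average_accuracy(
    baseline, example_sizes
)
print(f"gain: {gain_pp:.1f} pp")
```

Under this reading, larger benchmarks dominate the average, so a large gain on a small benchmark moves the headline number less than a modest gain on a large one.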
Problem

Research questions and friction points this paper is trying to address.

deep research
structured reasoning
web search
evidence aggregation
AI agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

structured reasoning
deep research agents
query planning
evidence extraction
tool-augmented search