Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

πŸ“… 2026-02-26
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the limitations of existing deep research agents in search-intensive tasks, which suffer from high latency, high cost, and poor cross-task generalization due to their reliance on serial reasoning. To overcome these challenges, we propose the β€œSearch More, Think Less” (SMTL) framework, which shifts long-horizon reasoning from deep sequential processing toward parallel evidence acquisition. SMTL enables efficient context management under constrained context budgets and supports joint training on both factoid question answering and open-ended research tasks through a unified data synthesis pipeline. Optimized end-to-end via supervised fine-tuning and reinforcement learning, SMTL achieves state-of-the-art performance on BrowseComp (48.6%), GAIA (75.7%), Xbench (82.0%), and DeepResearch Bench (45.9%). Compared to Mirothinker-v1.0, it reduces reasoning steps by 70.7% on BrowseComp while simultaneously improving accuracy.


πŸ“ Abstract
Recent deep research agents primarily improve performance by scaling reasoning depth, but this leads to high inference cost and latency in search-intensive scenarios. Moreover, generalization across heterogeneous research settings remains challenging. In this work, we propose \emph{Search More, Think Less} (SMTL), a framework for long-horizon agentic search that targets both efficiency and generalization. SMTL replaces sequential reasoning with parallel evidence acquisition, enabling efficient context management under constrained context budgets. To support generalization across task types, we further introduce a unified data synthesis pipeline that constructs search tasks spanning both deterministic question answering and open-ended research scenarios, with task-appropriate evaluation metrics. We train an end-to-end agent using supervised fine-tuning and reinforcement learning, achieving strong and often state-of-the-art performance across benchmarks including BrowseComp (48.6\%), GAIA (75.7\%), Xbench (82.0\%), and DeepResearch Bench (45.9\%). Compared to Mirothinker-v1.0, SMTL with a maximum of 100 interaction steps reduces the average number of reasoning steps on BrowseComp by 70.7\%, while improving accuracy.
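The core idea of replacing serial reasoning with parallel evidence acquisition under a context budget can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `search` stub, the character-based budget, and all function names are hypothetical stand-ins for the agent's real tool calls and token accounting.

```python
from concurrent.futures import ThreadPoolExecutor


def search(query: str) -> str:
    # Hypothetical stand-in for a real web-search tool call.
    return f"evidence for: {query}"


def parallel_evidence_acquisition(queries, context_budget_chars=2000, max_workers=8):
    """Issue all sub-queries concurrently, then pack results under a context budget.

    Contrast with a serial agent, which would interleave one search per
    reasoning step; here all evidence is fetched in one parallel round.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(search, queries))

    # Context management: keep only as much evidence as the budget allows.
    packed, used = [], 0
    for r in results:
        if used + len(r) > context_budget_chars:
            break
        packed.append(r)
        used += len(r)
    return packed
```

In this sketch, broadening the search (more queries per round) trades off against the budget-driven packing step, which mirrors the paper's framing of shifting cost from reasoning depth to evidence breadth.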
Problem

Research questions and friction points this paper is trying to address.

long-horizon agentic search
reasoning efficiency
generalization
inference cost
search-intensive scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic search
parallel evidence acquisition
context efficiency
generalization
data synthesis