Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research

📅 2026-02-28
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the limitations of large language models in tackling highly complex questions that demand long-horizon planning and integration of massive, heterogeneous evidence. To overcome these challenges, the authors propose an autonomous complex question-answering framework that combines “ultra-wide” and “ultra-deep” research paradigms. The approach employs structured task decomposition, large-scale multi-source retrieval, iterative deep querying, and a graph-anchored auditing protocol to produce verifiable research reports enriched with fine-grained citations and intermediate reasoning artifacts. Evaluated on a benchmark of 300 expert-level complex questions, the system demonstrates the capability to analyze thousands of web pages and perform hundred-step reasoning chains, substantially enhancing answer traceability and multidimensional credibility.

๐Ÿ“ Abstract
While Large Language Models (LLMs) have demonstrated proficiency in Deep Research or Wide Search, their capacity to solve highly complex questions (those requiring long-horizon planning, massive evidence gathering, and synthesis across heterogeneous sources) remains largely unexplored. We introduce Super Research, a task for complex autonomous research that integrates (i) structured decomposition into a research plan, (ii) super wide retrieval for diverse perspectives, and (iii) super deep investigation to resolve uncertainties through iterative queries. To evaluate this capability, we curated a benchmark of 300 expert-written questions across diverse domains, each requiring up to 100+ retrieval steps and 1,000+ web pages to reconcile conflicting evidence. Super Research produces verifiable reports with fine-grained citations and intermediate artifacts (e.g., outlines and tables) to ensure traceable reasoning. Furthermore, we present a graph-anchored auditing protocol that evaluates Super Research along five dimensions: Coverage, Logical Consistency, Report Utility, Objectivity, and Citation Health. While super-complex questions may be infrequent in standard applications, Super Research serves as a critical ceiling evaluation and stress test for LLM capabilities. A model's proficiency within Super Research acts as a powerful proxy for its general research competence; success here suggests the robustness necessary to navigate nearly any subordinate research task. The leaderboard is available at: https://cnsdqd-dyb.github.io/Super-Research-Benchmark/
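
The three stages named in the abstract (decomposition, wide retrieval, iterative deep investigation) can be pictured with a toy control loop. This is only a minimal sketch under stated assumptions, not the authors' system: the names `Finding`, `decompose`, `research`, and `fake_search`, the semicolon-based splitting, and the "stop once enough distinct sources are found" rule are all hypothetical stand-ins for what would be LLM-driven planning and real web retrieval.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical type for illustration: a retriever maps a query to source identifiers.
SearchFn = Callable[[str], List[str]]


@dataclass
class Finding:
    sub_question: str
    sources: List[str] = field(default_factory=list)  # citations backing this finding
    rounds: int = 0  # retrieval rounds spent on this sub-question


def decompose(question: str) -> List[str]:
    """Stage (i), toy version: split on ';'. A real planner would be model-driven."""
    return [part.strip() for part in question.split(";") if part.strip()]


def research(question: str, search: SearchFn,
             min_sources: int = 2, max_rounds: int = 3) -> List[Finding]:
    findings = []
    for sub_q in decompose(question):
        finding = Finding(sub_question=sub_q)
        query = sub_q
        # The first round plays the role of stage (ii), wide retrieval. Later rounds
        # play the role of stage (iii): re-query until enough distinct sources exist
        # (an assumed stopping rule) or the round budget runs out.
        while finding.rounds < max_rounds and len(finding.sources) < min_sources:
            for src in search(query):
                if src not in finding.sources:
                    finding.sources.append(src)
            finding.rounds += 1
            query = f"{sub_q} (follow-up {finding.rounds})"
        findings.append(finding)
    return findings


def fake_search(query: str) -> List[str]:
    """Stand-in retriever: one deterministic pseudo-source per query."""
    return [f"source://{query}"]


if __name__ == "__main__":
    for f in research("What changed; Who responded", fake_search):
        print(f.sub_question, f.rounds, f.sources)
```

Keeping the source list on each `Finding` mirrors the abstract's emphasis on fine-grained citations: every sub-answer carries the evidence it was built from, which is what makes a later audit possible.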
Problem

Research questions and friction points this paper is trying to address.

complex question answering
large language models
autonomous research
evidence synthesis
long-horizon planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Super Research
structured decomposition
super wide retrieval
super deep investigation
graph-anchored auditing