Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research

📅 2026-02-28
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This work addresses the limitations of large language models on highly complex questions that demand long-horizon planning and the integration of massive, heterogeneous evidence. The authors propose Super Research, an autonomous research task that combines "super wide" and "super deep" research. The approach uses structured task decomposition, large-scale retrieval for diverse perspectives, and iterative deep querying to produce verifiable research reports with fine-grained citations and intermediate artifacts such as outlines and tables. A graph-anchored auditing protocol scores these reports along five dimensions: Coverage, Logical Consistency, Report Utility, Objectivity, and Citation Health. The accompanying benchmark contains 300 expert-written questions, each of which can require more than 100 retrieval steps and more than 1,000 web pages to reconcile conflicting evidence.

📝 Abstract
While Large Language Models (LLMs) have demonstrated proficiency in Deep Research or Wide Search, their capacity to solve highly complex questions (those requiring long-horizon planning, massive evidence gathering, and synthesis across heterogeneous sources) remains largely unexplored. We introduce Super Research, a task for complex autonomous research that integrates (i) structured decomposition into a research plan, (ii) super wide retrieval for diverse perspectives, and (iii) super deep investigation to resolve uncertainties through iterative queries. To evaluate this capability, we curated a benchmark of 300 expert-written questions across diverse domains, each requiring up to 100+ retrieval steps and 1,000+ web pages to reconcile conflicting evidence. Super Research produces verifiable reports with fine-grained citations and intermediate artifacts (e.g., outlines and tables) to ensure traceable reasoning. Furthermore, we present a graph-anchored auditing protocol that evaluates Super Research along five dimensions: Coverage, Logical Consistency, Report Utility, Objectivity, and Citation Health. While super-complex questions may be infrequent in standard applications, Super Research serves as a critical ceiling evaluation and stress test for LLM capabilities. A model's proficiency within Super Research acts as a powerful proxy for its general research competence; success here suggests the robustness necessary to navigate nearly any subordinate research task. The leaderboard is available at: https://cnsdqd-dyb.github.io/Super-Research-Benchmark/
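
The three stages named in the abstract (decomposition, wide retrieval, deep iterative querying) can be illustrated with a toy control loop. This is a minimal sketch and not the authors' implementation: every name in it (`Evidence`, `Report`, `decompose`, `super_research`, the `search` callback, `max_depth`, `min_sources`) is hypothetical, the decomposition is a naive string split, and the stopping rule (enough distinct sources per sub-question) is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Evidence:
    source: str  # hypothetical source identifier, e.g. a URL
    claim: str   # what the source says


@dataclass
class Report:
    question: str
    plan: list[str]
    evidence: dict[str, list[Evidence]] = field(default_factory=dict)
    steps: int = 0  # number of retrieval calls issued


# A search backend is any function mapping a query to evidence items.
Search = Callable[[str], list[Evidence]]


def decompose(question: str) -> list[str]:
    """Stage (i): build a research plan. Placeholder: split on ' and '."""
    parts = [p.strip(" ?") for p in question.split(" and ")]
    return [p for p in parts if p]


def super_research(question: str, search: Search,
                   max_depth: int = 3, min_sources: int = 2) -> Report:
    plan = decompose(question)
    report = Report(question, plan)
    for sub in plan:
        found: list[Evidence] = []
        seen: set[str] = set()
        query = sub
        for depth in range(max_depth):
            report.steps += 1
            # Stage (ii): gather evidence, keeping one item per source.
            for item in search(query):
                if item.source not in seen:
                    seen.add(item.source)
                    found.append(item)
            # Assumed stopping rule: enough distinct sources were found.
            if len(found) >= min_sources:
                break
            # Stage (iii): issue a follow-up query to dig deeper.
            query = f"{sub} (follow-up {depth + 1})"
        report.evidence[sub] = found
    return report


def fake_search(query: str) -> list[Evidence]:
    """Stand-in backend: follow-up queries surface a second source."""
    if "follow-up" in query:
        return [Evidence("src-b", "detail")]
    return [Evidence("src-a", "overview")]


report = super_research("What drives cost and what drives risk?", fake_search)
```

Keeping each `Evidence` item attached to the sub-question it answers is one simple way to keep citations traceable back to the plan, in the spirit of the fine-grained citations the abstract describes.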
Problem

Research questions and friction points this paper is trying to address.

complex question answering
large language models
autonomous research
evidence synthesis
long-horizon planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Super Research
structured decomposition
super wide retrieval
super deep investigation
graph-anchored auditing
Yubo Dong
College of Computer Science and Technology
Nianhao You
College of Computer Science and Technology
Yuxuan Hou
College of Computer Science and Technology
Zixun Sun
College of Computer Science and Technology
Yue Zhang
University of Electronic Science and Technology of China
medical image analysis, deep learning, computer vision
Hehe Fan
Zhejiang University
Deep learning, Computer vision, Multimedia, AI for science
Liang Zhang
Ant Financial
Computational Advertising, Recommender Systems, Information Retrieval
Siyuan Zhao
Ant Group, Hangzhou, China
Linyi
Ant Group, Hangzhou, China