Self-DC: When to Reason and When to Act? Self Divide-and-Conquer for Compositional Unknown Questions

πŸ“… 2024-02-21
πŸ“ˆ Citations: 15
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This paper addresses compositional question answering (QA) tasks comprising both known and unknown sub-questions, where existing methods lack fine-grained adaptivity in choosing between internal knowledge and external retrieval for each sub-question. Method: It proposes Self-DC, an adaptive divide-and-conquer framework for large language models (LLMs), featuring meta-prompt-driven sub-question decomposition, dynamic execution-path planning, and retrieval-generation co-scheduling. It further introduces CuQA, the first benchmark dataset for compositional QA with unknown sub-questions. Contribution/Results: Self-DC achieves state-of-the-art or competitive performance on two major benchmarks while reducing external API calls by 38.7% on average. It establishes a paradigm of efficient, fine-grained reasoning-retrieval synergy: a sub-question-level adaptive decision mechanism and a scalable hybrid solving architecture enable LLMs to dynamically balance generation and retrieval at the granularity of individual semantic units.

πŸ“ Abstract
Previous research has typically concentrated on leveraging the internal knowledge of Large Language Models (LLMs) to answer known questions (i.e., internal reasoning such as generate-then-read). In contrast, for questions that fall outside their known scope, these models rely on external knowledge retrieval to provide accurate responses (i.e., external acting such as retrieve-then-read). However, few previous works consider compositional questions, which consist of several known and unknown sub-questions and therefore require dynamically combining the two methods (internal reasoning and external acting) to achieve a better trade-off between effectiveness and efficiency. To this end, we introduce a Self Divide-and-Conquer (Self-DC) framework, accompanied by the first Compositional unknown Question-Answering dataset (CuQA). This framework enables LLMs to adaptively choose between using internal knowledge and retrieving external knowledge as needed, resulting in a better trade-off between effectiveness and efficiency. Experimental results on two datasets demonstrate that Self-DC can achieve comparable or even better performance with far fewer external calls than several strong baselines.
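The control flow the abstract describes can be sketched as a recursive loop: estimate confidence on a question, answer internally if it looks known, retrieve if it looks unknown, and otherwise decompose and recurse on sub-questions. The sketch below is a minimal illustration under assumptions, not the paper's implementation — the thresholds (0.7, 0.3), the callables `confidence`, `decompose`, `reason`, `retrieve`, `combine`, and the toy stand-ins are all hypothetical.

```python
from typing import Callable, List

def self_dc(
    question: str,
    confidence: Callable[[str], float],    # model's self-estimated certainty in [0, 1]
    decompose: Callable[[str], List[str]], # split a question into sub-questions
    reason: Callable[[str], str],          # internal reasoning ("generate-then-read")
    retrieve: Callable[[str], str],        # external acting ("retrieve-then-read")
    combine: Callable[[str, List[str]], str],
    depth: int = 0,
    max_depth: int = 2,
) -> str:
    c = confidence(question)
    if c >= 0.7:                           # confidently known: use internal knowledge
        return reason(question)
    if c <= 0.3 or depth >= max_depth:     # confidently unknown (or too deep): retrieve
        return retrieve(question)
    # Uncertain: divide into sub-questions and conquer each recursively.
    subs = decompose(question)
    answers = [
        self_dc(s, confidence, decompose, reason, retrieve, combine, depth + 1, max_depth)
        for s in subs
    ]
    return combine(question, answers)

# Toy stand-ins for the LLM/retriever calls, just to exercise the control flow.
known = {"capital of France?": "Paris"}
result = self_dc(
    "capital of France?",
    confidence=lambda q: 0.9 if q in known else 0.1,
    decompose=lambda q: [q],
    reason=lambda q: known[q],
    retrieve=lambda q: "<retrieved answer>",
    combine=lambda q, answers: " ".join(answers),
)
print(result)  # -> Paris
```

The key design point the paper highlights is that the known/unknown decision is made per sub-question rather than once for the whole question, which is what lets the framework save external calls on the known parts.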
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Adaptive Knowledge Utilization
Complex Problem Solving
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-DC
Compositional unknown Question-Answering dataset (CuQA)
Intelligent Decision-making in Language Models
πŸ”Ž Similar Papers
No similar papers found.
Hongru Wang
The Chinese University of Hong Kong, MoE Key Lab of High Confidence Software Technologies, CUHK
Boyang Xue
Ph.D. Candidate in The Chinese University of Hong Kong
Natural Language Processing, Large Language Models, Speech Recognition
Baohang Zhou
Nankai University
Tianhua Zhang
The Chinese University of Hong Kong
Natural Language Processing, Large Language Models
Cunxiang Wang
Tsinghua University; ZhipuAI
Large Language Models, LLM Evaluation, LLM Post-training
Huimin Wang
Jarvis Research Center, Tencent YouTu Lab
Guanhua Chen
Southern University of Science and Technology
Kam-Fai Wong
The Chinese University of Hong Kong, MoE Key Lab of High Confidence Software Technologies, CUHK