Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs

πŸ“… 2026-03-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing approaches struggle to manage large numbers of noisy candidate tools in long-context tool-augmented tasks, limiting the practical deployment of large language models. This work proposes Tool-DC, a novel framework that integrates a divide-and-conquer strategy with self-reflection through an iterative "try-check-retry" paradigm to reduce reasoning complexity. Tool-DC features two variants: a training-free version, Tool-DC (TF), enabling plug-and-play usability, and a trainable version, Tool-DC (TB), optimized for efficient inference. Experimental results demonstrate that Tool-DC (TF) achieves an average performance gain of 25.10% on the BFCL and ACEBench benchmarks, while Tool-DC (TB) empowers Qwen2.5-7B to match or even surpass closed-source models such as OpenAI o3 and Claude-Haiku-4.5 in tool-calling capability.

πŸ“ Abstract
Tool-calling empowers Large Language Models (LLMs) to interact with external environments. However, current methods often struggle to handle massive and noisy candidate tool sets in long-context tool-calling tasks, limiting their real-world application. To this end, we propose Tool-DC, a Divide-and-Conquer framework for boosting the tool-calling performance of LLMs. The core of Tool-DC is to reduce the reasoning difficulty and make full use of the self-reflection ability of LLMs via a "Try-Check-Retry" paradigm. Specifically, Tool-DC involves two variants: 1) the training-free Tool-DC (TF), which is plug-and-play and flexible; 2) the training-based Tool-DC (TB), which is more inference-efficient. Extensive experiments show that both Tool-DC methods outperform their counterparts by a clear margin. Tool-DC (TF) brings up to +25.10% average gains over the baseline on the BFCL and ACEBench benchmarks, while Tool-DC (TB) enables Qwen2.5-7B to achieve comparable or even better performance than proprietary LLMs, e.g., OpenAI o3 and Claude-Haiku-4.5.
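A minimal sketch of how such a divide-and-conquer "Try-Check-Retry" loop could look. Everything here is an illustrative assumption rather than the paper's implementation: the function names `propose_call` and `check_call` stand in for the LLM's proposal and self-reflection steps, and the chunking strategy and parameters are placeholders.

```python
from typing import Callable, List, Optional

def try_check_retry(
    query: str,
    tools: List[dict],
    propose_call: Callable[[str, List[dict]], Optional[dict]],
    check_call: Callable[[str, dict], bool],
    chunk_size: int = 8,
    max_retries: int = 2,
) -> Optional[dict]:
    """Hypothetical sketch: divide a long, noisy tool pool into small
    chunks, then run a try-check-retry loop within each chunk.

    `propose_call` models the LLM proposing a tool call from a small
    candidate set; `check_call` models the self-reflection step that
    verifies the proposal. Both interfaces are assumptions for
    illustration, not the paper's actual API.
    """
    # Divide: split the candidate tools into manageable chunks so each
    # reasoning step only sees a short, low-noise context.
    chunks = [tools[i:i + chunk_size] for i in range(0, len(tools), chunk_size)]
    for chunk in chunks:
        # Conquer: try-check-retry within the reduced candidate set.
        for _ in range(max_retries + 1):
            call = propose_call(query, chunk)   # Try
            if call is None:
                break                           # no plausible tool in this chunk
            if check_call(query, call):         # Check (self-reflection)
                return call
            # Retry: loop again, letting the proposer revise its choice.
    return None
```

The point of the sketch is that each proposal step conditions on only `chunk_size` tools instead of the full pool, which is how a divide-and-conquer split reduces long-context reasoning difficulty, while the check step supplies the self-reflection signal that triggers a retry.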
Problem

Research questions and friction points this paper is trying to address.

tool-calling
long-context
noisy candidate tools
LLMs
real-world application
Innovation

Methods, ideas, or system contributions that make the work stand out.

Divide-and-Conquer
Tool Calling
Long-context Reasoning
Self-reflection
Large Language Models
Kunfeng Chen
School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, China
Qihuang Zhong
Wuhan University
Large Language Models · Natural Language Processing
Juhua Liu
School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, China
Bo Du
Department of Management, Griffith Business School
Sustainable Transport · Travel Behaviour · Urban Data Analytics · Logistics and Supply Chain
Dacheng Tao
Nanyang Technological University
artificial intelligence · machine learning · computer vision · image processing · data mining