The Evolution of Tool Use in LLM Agents: From Single-Tool Call to Multi-Tool Orchestration

📅 2026-03-24
🤖 AI Summary
This work addresses the challenge of coordinating multiple tools with large language models to accomplish long-horizon tasks in complex, dynamic environments, a setting where prior research has largely been confined to single-tool or one-step interactions. We formulate multi-tool coordination as dynamic orchestration over extended trajectories and introduce a unified task definition alongside a structured analytical framework. Along six core dimensions (planning and execution, training methodologies, safety, efficiency, capability completeness, and evaluation benchmarks), we systematically review key techniques, including reasoning-time planning, trajectory-based training, safety-aware control, resource-constrained optimization, and modeling in open-ended environments. By synthesizing insights from software engineering, enterprise workflows, GUI automation, and mobile systems, we delineate the fundamental challenges and future directions for building reliable, scalable, and verifiable multi-tool agents.

📝 Abstract
Tool use enables large language models (LLMs) to access external information, invoke software systems, and act in digital environments beyond what can be solved from model parameters alone. Early research mainly studied whether a model could select and execute a correct single tool call. As agent systems evolve, however, the central problem has shifted from isolated invocation to multi-tool orchestration over long trajectories with intermediate state, execution feedback, changing environments, and practical constraints such as safety, cost, and verifiability. We comprehensively review recent progress in multi-tool LLM agents and analyze the state of the art in this rapidly developing area. First, we unify task formulations and distinguish single-call tool use from long-horizon orchestration. Then, we organize the literature around six core dimensions: inference-time planning and execution, training and trajectory construction, safety and control, efficiency under resource constraints, capability completeness in open environments, and benchmark design and evaluation. We further summarize representative applications in software engineering, enterprise workflows, graphical user interfaces, and mobile systems. Finally, we discuss major challenges and outline future directions for building reliable, scalable, and verifiable multi-tool agents.
Problem

Research questions and friction points this paper is trying to address.

multi-tool orchestration
LLM agents
tool use
long-horizon tasks
agent safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-tool orchestration
LLM agents
tool use
long-horizon planning
agent safety
Haoyuan Xu
Harbin Institute of Technology, China
Chang Li
Harbin Institute of Technology, China
Xinyan Ma
Harbin Institute of Technology, China
Xianhao Ou
Harbin Institute of Technology, China
Zihan Zhang
Harvard University, USA
Tao He
Harbin Institute of Technology
Complex Reasoning, Dialogue Policy Planning, Knowledge Graph Reasoning
Xiangyu Liu
Harbin Institute of Technology, China
Zixiang Wang
Peking University
AI for Healthcare
Jiafeng Liang
Harbin Institute of Technology, China
Zheng Chu
Harbin Institute of Technology
Natural Language Processing
Runxuan Liu
Harbin Institute of Technology, China
Rongchuan Mu
Harbin Institute of Technology, China
Ming Liu
Harbin Institute of Technology
Computer Vision, Deep Learning
Bing Qin
Professor in Harbin Institute of Technology
Natural Language Processing, Information Extraction, Sentiment Analysis