The Evolution of Tool Use in LLM Agents: From Single-Tool Call to Multi-Tool Orchestration

📅 2026-03-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of coordinating multiple tools with large language models to accomplish long-horizon tasks in complex, dynamic environments, a setting where prior research has largely been confined to single-tool or one-step interactions. We formulate multi-tool coordination as dynamic orchestration over extended trajectories and introduce a unified task definition alongside a structured analytical framework. Across six core dimensions (planning and execution, training methodologies, safety, efficiency, capability completeness, and evaluation benchmarks), we systematically review key techniques, including reasoning-time planning, trajectory-based training, safety-aware control, resource-constrained optimization, and modeling in open-ended environments. By synthesizing insights from software engineering, enterprise workflows, GUI automation, and mobile systems, we delineate the fundamental challenges and future directions for building reliable, scalable, and verifiable multi-tool agents.

📝 Abstract

Tool use enables large language models (LLMs) to access external information, invoke software systems, and act in digital environments beyond what can be solved from model parameters alone. Early research mainly studied whether a model could select and execute a correct single tool call. As agent systems evolve, however, the central problem has shifted from isolated invocation to multi-tool orchestration over long trajectories with intermediate state, execution feedback, changing environments, and practical constraints such as safety, cost, and verifiability. We comprehensively review recent progress in multi-tool LLM agents and analyze the state of the art in this rapidly developing area. First, we unify task formulations and distinguish single-call tool use from long-horizon orchestration. Then, we organize the literature around six core dimensions: inference-time planning and execution, training and trajectory construction, safety and control, efficiency under resource constraints, capability completeness in open environments, and benchmark design and evaluation. We further summarize representative applications in software engineering, enterprise workflows, graphical user interfaces, and mobile systems. Finally, we discuss major challenges and outline future directions for building reliable, scalable, and verifiable multi-tool agents.

Problem

Research questions and friction points this paper is trying to address.

multi-tool orchestration

LLM agents

tool use

long-horizon tasks

agent safety

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-tool orchestration

LLM agents

tool use