Budget-Aware Tool-Use Enables Effective Agent Scaling

📅 2025-11-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Tool-augmented large language model (LLM) agents scale poorly under explicit tool-call budgets, particularly in web search. Simply granting a larger budget fails to improve performance because existing agents lack budget awareness and quickly hit a performance ceiling. Method: The paper presents the first systematic study of budget-constrained agents and introduces two components: the Budget Tracker, a lightweight plug-in that gives the agent continuous awareness of its remaining budget, and BATS (Budget Aware Test-time Scaling), a framework that uses this awareness to adapt its planning and verification strategy dynamically. A unified cost metric that jointly accounts for token and tool consumption allows cost-performance scaling to be analyzed in a controlled manner. Contribution/Results: Budget-aware methods produce more favorable scaling curves and push the cost-performance Pareto frontier. The work offers empirical insights toward a more transparent and principled understanding of scaling in tool-augmented agents.

📝 Abstract

Scaling test-time computation improves performance across different tasks on large language models (LLMs), which has also been extended to tool-augmented agents. For these agents, scaling involves not only "thinking" in tokens but also "acting" via tool calls. The number of tool calls directly bounds the agent's interaction with the external environment. However, we find that simply granting agents a larger tool-call budget fails to improve performance, as they lack "budget awareness" and quickly hit a performance ceiling. To address this, we study how to scale such agents effectively under explicit tool-call budgets, focusing on web search agents. We first introduce the Budget Tracker, a lightweight plug-in that provides the agent with continuous budget awareness, enabling simple yet effective scaling. We further develop BATS (Budget Aware Test-time Scaling), an advanced framework that leverages this awareness to dynamically adapt its planning and verification strategy, deciding whether to "dig deeper" on a promising lead or "pivot" to new paths based on remaining resources. To analyze cost-performance scaling in a controlled manner, we formalize a unified cost metric that jointly accounts for token and tool consumption. We provide the first systematic study on budget-constrained agents, showing that budget-aware methods produce more favorable scaling curves and push the cost-performance Pareto frontier. Our work offers empirical insights toward a more transparent and principled understanding of scaling in tool-augmented agents.
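
To make the idea of continuous budget awareness concrete, here is a minimal Python sketch. It is not the paper's implementation: the class name, the message format, and the `choose_strategy` rule (including its `dig_cost` threshold) are all hypothetical, chosen only to illustrate how a remaining-budget signal could be surfaced to an agent and used to choose between "dig deeper" and "pivot".

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Hypothetical tracker that counts tool calls against a fixed budget."""

    total_calls: int
    used_calls: int = 0

    @property
    def remaining(self) -> int:
        return self.total_calls - self.used_calls

    def record_call(self) -> None:
        # Refuse to go past the budget rather than silently overspending.
        if self.remaining <= 0:
            raise RuntimeError("tool-call budget exhausted")
        self.used_calls += 1

    def status_message(self) -> str:
        # Text that could be appended to the agent's context after each tool call
        # (the format is an assumption, not taken from the paper).
        return (
            f"[budget] tool calls used: {self.used_calls}/{self.total_calls}, "
            f"remaining: {self.remaining}"
        )


def choose_strategy(tracker: BudgetTracker, lead_is_promising: bool,
                    dig_cost: int = 2) -> str:
    """Toy decision rule (an assumption): dig deeper only if the lead looks
    promising and enough calls remain; answer once the budget is spent."""
    if tracker.remaining <= 0:
        return "answer"
    if lead_is_promising and tracker.remaining >= dig_cost:
        return "dig_deeper"
    return "pivot"
```

In an agent loop, `record_call()` would run once per tool invocation and `status_message()` would be added to the next prompt, so the model always sees how much budget is left.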

Problem

Research questions and friction points this paper is trying to address.

Enabling tool-augmented agents to effectively scale under explicit tool-call budget constraints

Addressing performance limitations from agents lacking budget awareness during tool interactions

Developing systematic methods to optimize cost-performance tradeoffs in budget-constrained agent scaling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Budget Tracker provides continuous budget awareness

BATS framework dynamically adapts planning strategy

Unified cost metric jointly accounts for tokens and tools
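
The page does not give the formula for the unified cost metric, so the sketch below assumes the simplest form that jointly accounts for both resources: a weighted sum of tokens and tool calls. The function names and the per-unit prices are hypothetical, and the sample points are made-up illustrations rather than results from the paper. The second function shows how a cost-performance Pareto frontier can be extracted from such (cost, accuracy) pairs.

```python
def unified_cost(tokens: int, tool_calls: int,
                 price_per_token: float, price_per_tool_call: float) -> float:
    """Assumed linear cost: tokens and tool calls mapped onto one scale."""
    return tokens * price_per_token + tool_calls * price_per_tool_call


def pareto_frontier(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Keep (cost, accuracy) points that no cheaper-or-equal point matches or beats."""
    frontier = []
    best_accuracy = float("-inf")
    # Sort by cost ascending; at equal cost, consider the higher accuracy first.
    for cost, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        if accuracy > best_accuracy:
            frontier.append((cost, accuracy))
            best_accuracy = accuracy
    return frontier


# Made-up runs: (unified cost, accuracy). Not data from the paper.
runs = [(1.0, 0.30), (2.0, 0.25), (3.0, 0.50), (3.0, 0.40)]
frontier = pareto_frontier(runs)  # [(1.0, 0.3), (3.0, 0.5)]
```

Under this reading, a method "pushes the Pareto frontier" when its runs reach higher accuracy at the same unified cost, or the same accuracy at lower cost, than the baseline's frontier points.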
