🤖 AI Summary
To address the high annotation failure rate and low efficiency of synthetic tool-use data generation, this paper proposes ToolGrad, a zero-failure, "answer-first" framework for automated data synthesis. ToolGrad reverses the conventional generation pipeline: it first constructs valid tool-invocation chains via agent-driven iterative chaining and text-gradient-guided search, and then back-translates each chain into a natural-language query that semantically aligns with the execution trace. This inversion guarantees 100% annotation validity while also increasing data complexity and synthesis throughput. Using ToolGrad, the authors synthesize a high-quality 5K-sample dataset. Models trained on this data consistently outperform open-source and commercial large language models of comparable scale on out-of-distribution tool-use benchmarks, achieving superior performance at significantly lower training cost.
📝 Abstract
Prior work synthesizes tool-use LLM datasets by first generating a user query and then producing complex tool-use annotations via search procedures like DFS. This leads to inevitable annotation failures and low efficiency in data generation. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad first constructs valid tool-use chains through an iterative process guided by textual "gradients", and then synthesizes corresponding user queries. This "answer-first" approach yields ToolGrad-5k, a dataset generated with more complex tool use, lower cost, and a 100% pass rate. Experiments show that models trained on ToolGrad-5k outperform those trained on more expensive baseline datasets, as well as proprietary LLMs, even on OOD benchmarks.
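The "answer-first" inversion can be sketched in a few lines: grow a valid tool chain step by step under some scoring signal, then back-translate the finished chain into a query. The sketch below is purely illustrative; the tool names, the toy scoring heuristic standing in for an LLM's textual gradient, and the template-based query synthesis are all assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of ToolGrad's "answer-first" pipeline.
# All names and heuristics here are illustrative assumptions.

TOOLS = ["search_flights", "get_weather", "book_hotel", "convert_currency"]

def propose_extensions(chain):
    """Candidate tools that could extend the current chain."""
    return [t for t in TOOLS if t not in chain]

def textual_gradient_score(chain, candidate):
    """Stand-in for an LLM critique ("textual gradient") that scores how
    well a candidate tool extends the chain; here just a toy heuristic."""
    return len(chain) + TOOLS.index(candidate) % 3

def grow_chain(steps=3):
    """Answer-first: iteratively build a valid tool-use chain, guided by
    the gradient signal, *before* any user query exists."""
    chain = []
    for _ in range(steps):
        candidates = propose_extensions(chain)
        if not candidates:
            break
        best = max(candidates, key=lambda c: textual_gradient_score(chain, c))
        chain.append(best)  # every appended call is valid by construction
    return chain

def synthesize_query(chain):
    """Back-translate the executed chain into a user query (a toy template
    here; in the paper an LLM writes a semantically aligned query)."""
    return "Help me with: " + ", then ".join(chain)

chain = grow_chain()
print(chain)
print(synthesize_query(chain))
```

Because the query is derived from an already-valid chain rather than the reverse, every generated sample passes by construction, which is what yields the 100% pass rate the abstract reports.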