🤖 AI Summary
Large language models (LLMs) have limited tool-calling capability, and current approaches rely heavily on static data synthesis for fine-tuning while largely ignoring how to fully exploit the model's own potential. Method: This paper proposes ToolACE-R, a feedback-free framework that introduces adaptive self-refinement for tool invocations via three key components: a model-aware iterative training procedure that progressively expands the training set as the model's capability grows, iterative self-refinement of tool calls without external feedback, and an adaptive inference-time mechanism that lets the model autonomously decide when to stop refining. Contribution/Results: ToolACE-R couples refinement to the model's evolving capability, balancing effectiveness and efficiency. Across multiple benchmarks, it is competitive with advanced API-based models even without refinement, improves further through adaptive self-refinement within only a few iterations, and remains compatible with base models of varying scales. ToolACE-R thus offers an efficient, scalable paradigm for LLM tool learning without external supervision.
📝 Abstract
Tool learning, which allows Large Language Models (LLMs) to leverage external tools for solving complex user tasks, has emerged as a promising avenue for extending model capabilities. However, current approaches primarily focus on data synthesis for fine-tuning LLMs to invoke tools effectively, largely ignoring how to fully unlock the model's own potential. In this paper, we propose ToolACE-R, a novel method that introduces adaptive self-refinement for tool invocations. Our approach features a model-aware iterative training procedure that progressively incorporates more training samples based on the model's evolving capabilities. It also allows LLMs to iteratively refine their tool calls, improving performance without requiring external feedback. To further enhance computational efficiency, we integrate an adaptive mechanism for scaling inference-time computation, enabling the model to autonomously determine when to stop the refinement process. We conduct extensive experiments across several benchmark datasets, showing that ToolACE-R achieves competitive performance compared to advanced API-based models, even without any refinement, and that its performance can be improved further and efficiently through adaptive self-refinement. Our results demonstrate the effectiveness of the proposed method, which is compatible with base models of various sizes, offering a promising direction for more efficient tool learning.
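The adaptive self-refinement loop in the abstract — generate a tool call, refine it iteratively, and let the model itself decide when to stop — can be sketched as follows. This is a minimal illustration, not the paper's implementation: `propose_call`, `refine`, and `should_stop` are hypothetical stand-ins for the LLM's generation and self-evaluation steps, stubbed here with a simple quality score so the loop is runnable, and `MAX_STEPS` is an assumed budget cap.

```python
MAX_STEPS = 4  # assumed cap on refinement iterations (inference-time budget)

def propose_call(task: str) -> dict:
    """Stub for the model's initial tool invocation (hypothetical)."""
    return {"tool": "search", "args": {"query": task}, "quality": 0.4}

def refine(call: dict) -> dict:
    """Stub for one self-refinement step; a real LLM would rewrite the call."""
    improved = dict(call)
    improved["quality"] = min(1.0, call["quality"] + 0.3)
    return improved

def should_stop(call: dict) -> bool:
    """Stub self-evaluation: the model signals termination on its own,
    with no external feedback."""
    return call["quality"] >= 0.9

def adaptive_self_refine(task: str) -> tuple[dict, int]:
    """Refine the tool call until the model says stop or the budget runs out."""
    call = propose_call(task)
    steps = 0
    while steps < MAX_STEPS and not should_stop(call):
        call = refine(call)
        steps += 1
    return call, steps

final_call, n_steps = adaptive_self_refine("weather in Paris")
print(final_call["quality"], n_steps)  # 1.0 2
```

The key design point mirrored here is that termination is decided by the model's own evaluation rather than by a fixed number of rounds, which is how the abstract's adaptive mechanism avoids wasted inference-time computation.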