🤖 AI Summary
This paper addresses overreliance and overconfidence in large language models (LLMs) during tool invocation. To recalibrate models' awareness of their knowledge boundaries, we propose a multi-objective alignment framework. Methodologically, it introduces: (1) a knowledge boundary estimation technique grounded in consistency checking and absolute confidence scoring; and (2) a dynamic decision integration mechanism that jointly leverages probabilistic modeling, supervised fine-tuning, and inference-time intervention. Experiments across diverse scenarios show that the approach reduces redundant tool calls by 37.2% on average while preserving task performance. It also improves response latency and lowers computational cost, achieving, for the first time, simultaneous optimization of reliability, efficiency, and cost-effectiveness in tool-augmented LLMs.
📝 Abstract
Recent advancements in tool learning have enabled large language models (LLMs) to integrate external tools, enhancing their task performance by expanding their knowledge boundaries. However, relying on tools often introduces tradeoffs between performance, speed, and cost, and LLMs sometimes exhibit overreliance and overconfidence in tool usage. This paper addresses the challenge of aligning LLMs with their knowledge boundaries so that they make more intelligent decisions about tool invocation. We propose a multi-objective alignment framework that combines probabilistic knowledge boundary estimation with dynamic decision-making, allowing LLMs to better assess when to invoke tools based on their confidence. Our framework includes two methods for knowledge boundary estimation (consistency-based and absolute estimation) and two training strategies for integrating these estimates into the model's decision-making process. Experimental results on various tool invocation scenarios demonstrate the effectiveness of our framework, showing significant improvements in tool efficiency through reduced unnecessary tool usage.
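
The abstract does not spell out how the consistency-based estimate is computed, so the following is only a minimal sketch of one common reading: sample several answers from the model, treat the share that agree with the most frequent answer as a confidence score, and invoke a tool only when that score falls below a threshold. The function names, the normalization step, and the `threshold` value of 0.7 are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def consistency_confidence(answers: Sequence[str]) -> float:
    """Share of sampled answers that match the most frequent answer.

    Hypothetical helper: the paper's actual consistency measure may differ.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Assumed normalization: ignore case and surrounding whitespace.
    normalized = [a.strip().lower() for a in answers]
    top_count = Counter(normalized).most_common(1)[0][1]
    return top_count / len(normalized)


def should_invoke_tool(answers: Sequence[str], threshold: float = 0.7) -> bool:
    """Invoke a tool only when self-consistency is below the threshold.

    The threshold is an illustrative value, not one reported in the paper.
    """
    return consistency_confidence(answers) < threshold


# Usage: three of four samples agree, so the model answers without a tool.
samples = ["Paris", "paris ", "Lyon", "Paris"]
print(consistency_confidence(samples))
print(should_invoke_tool(samples))
```

Under this reading, raising the threshold trades efficiency for caution: more questions are routed to the tool, which costs latency and compute but reduces the risk of answering from unreliable internal knowledge.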