🤖 AI Summary
To address two limitations of large language models (LLMs) in tool invocation, namely restricted tool-calling ability and low-quality, semantically impoverished instruction-tuning data, this paper proposes a knowledge graph–based method for generating high-quality instruction data. For the first time, it leverages a manually curated knowledge graph to automatically extract semantically coherent query paths, map them to tool-call sequences, and parse them into structured solution steps, thereby synthesizing semantically rich, logically clear instruction data. The approach achieves effective supervised fine-tuning with only a small amount of synthetic data, significantly improving LLMs' tool-selection accuracy and task-completion rates across multiple tool-learning benchmarks. Its core contribution is an interpretable knowledge graph–to–tool-instruction generation paradigm that combines high data efficiency with strong generalization.
📝 Abstract
Teaching large language models (LLMs) to use tools is crucial for improving their problem-solving abilities and expanding their applications. However, effectively using tools is challenging because it requires a deep understanding of tool functionalities and user intentions. Previous methods relied mainly on LLMs to generate instruction data, but the quality of these data was often insufficient. In this paper, we propose a new method that uses knowledge graphs to generate high-quality instruction data for LLMs. Knowledge graphs are manually curated datasets rich in semantic information. We begin by extracting various query pathways from a given knowledge graph, which are transformed into a broad spectrum of user queries. We then translate the relationships between entities into actionable tools and parse the pathways of each query into detailed solution steps, thereby creating high-quality instruction data. Our experiments show that fine-tuning on just a small sample of this synthetic data can significantly improve the tool utilization and overall capabilities of LLMs.
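The pipeline described in the abstract (sample a path from a knowledge graph, turn it into a user query, map each relation to a tool, and record the tool calls as solution steps) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy graph, the relation-to-tool mapping, and all entity, relation, and tool names are invented for the example.

```python
# Hypothetical sketch of KG-to-instruction-data generation. The graph,
# relations, and tool names below are invented for illustration only.

# Toy knowledge graph: each entity maps to a list of (relation, target) edges.
KG = {
    "Inception": [("directed_by", "Christopher Nolan")],
    "Christopher Nolan": [("born_in", "London")],
    "London": [("located_in", "United Kingdom")],
}

# Assumed mapping from KG relations to callable tools.
RELATION_TO_TOOL = {
    "directed_by": "get_director",
    "born_in": "get_birthplace",
    "located_in": "get_country",
}

def extract_path(start, hops):
    """Walk up to `hops` edges from `start`, collecting traversed triples."""
    path, node = [], start
    for _ in range(hops):
        edges = KG.get(node)
        if not edges:
            break
        relation, target = edges[0]  # deterministic pick for the sketch
        path.append((node, relation, target))
        node = target
    return path

def path_to_instruction(path):
    """Turn a KG path into one (query, tool-call steps) training example."""
    head = path[0][0]
    relations = " -> ".join(rel for _, rel, _ in path)
    query = f"Starting from '{head}', resolve the chain: {relations}."
    steps = [
        {"step": i + 1, "tool": RELATION_TO_TOOL[rel], "input": src, "output": dst}
        for i, (src, rel, dst) in enumerate(path)
    ]
    return {"query": query, "steps": steps}

example = path_to_instruction(extract_path("Inception", hops=3))
```

Each multi-hop path thus yields a query plus an ordered sequence of tool calls, which is the shape of supervised fine-tuning data the paper targets; a real system would sample many diverse paths and verbalize the queries with richer natural language.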