GRAFT: Graph-Tokenized LLMs for Tool Planning

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing approaches struggle to align tool selection with the structural and execution dependencies among subtasks in complex tasks; they often produce plans that violate graph constraints, and early errors accumulate across later steps. This work proposes GRAFT, a graph-tokenized language model framework that internalizes the tool dependency graph by mapping each tool node to a dedicated graph token and modeling directed execution dependencies within the representation space. By adding on-policy tool context distillation with step-level planning signals, the method enables dependency-aware multi-step tool planning. The approach reports state-of-the-art performance on exact sequence matching and dependency legality, improving the reliability of large language models when orchestrating tool invocations in complex workflows.
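The two reported metrics can be made concrete with a small sketch. This is only one plausible reading of "exact sequence matching" and "dependency legality"; the summary does not give the paper's exact definitions, so the toy graph, the tool names, and both function names below are assumptions for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy tool graph (not from the paper):
# an edge u -> v means tool v may directly follow tool u.
TOOL_GRAPH: Dict[str, Set[str]] = {
    "search": {"summarize", "translate"},
    "summarize": {"translate"},
    "translate": set(),
}


def exact_match(predicted: List[str], reference: List[str]) -> bool:
    """True only if the predicted plan equals the reference plan step for step."""
    return predicted == reference


def dependency_valid(plan: List[str], graph: Dict[str, Set[str]]) -> bool:
    """True if every tool is a known node and every consecutive pair is a directed edge."""
    if any(tool not in graph for tool in plan):
        return False
    return all(nxt in graph[cur] for cur, nxt in zip(plan, plan[1:]))


reference = ["search", "summarize", "translate"]
plausible_but_different = ["search", "translate"]
illegal = ["translate", "search"]

print(exact_match(reference, reference))                  # identical plans
print(dependency_valid(plausible_but_different, TOOL_GRAPH))  # legal path, but not the reference
print(dependency_valid(illegal, TOOL_GRAPH))              # violates edge direction
```

Under this reading, a plan can be dependency-legal without matching the reference exactly, which is why the two metrics are reported separately.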

📝 Abstract

Large language models (LLMs) are increasingly used to complete complex tasks by selecting and coordinating external tools across multiple steps. This requires aligning tool choices with subtask intent while satisfying directional execution dependencies among tools. To do this, existing methods model these dependencies as tool graphs and incorporate the graphs with LLMs through retrieval, serialization, or prompt-level injection. However, these external graph-use strategies all follow a matching paradigm, which often fails to align tool choices with the underlying subtask structure, producing semantically plausible plans that violate graph constraints. This issue is further exacerbated by error accumulation, where an early incorrect tool selection shifts the plan into an invalid graph state and causes subsequent predictions to drift away from the valid execution path. To address these challenges, we propose GRAFT, a graph-tokenized language model framework for dependency-aware tool planning. GRAFT internalizes the tool graph by mapping each tool node to a dedicated special token and learning directed tool dependencies within the representation space. It further introduces on-policy tool context distillation, training the model on its own sampled trajectories while distilling stepwise planning signals. Experiments show that GRAFT achieves state-of-the-art performance in exact sequence matching and dependency legality, supporting more reliable LLM tool planning in complex workflows.
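As a rough illustration of the data structure the abstract describes, the sketch below assigns each tool node a dedicated special token and looks up which tool tokens are legal successors of a partial plan. It is not GRAFT's training procedure: the abstract says dependencies are learned in the representation space, whereas this sketch applies a hard lookup purely to show what a "valid graph state" means. The token format `<tool_i>`, the tool names, and the helper names are all assumptions.

```python
from typing import Dict, List, Set

# Hypothetical directed tool graph (not from the paper):
# an edge u -> v means v can consume the output of u.
TOOL_GRAPH: Dict[str, Set[str]] = {
    "fetch_page": {"extract_text"},
    "extract_text": {"summarize", "translate"},
    "translate": {"summarize"},
    "summarize": set(),
}


def build_tool_tokens(graph: Dict[str, Set[str]]) -> Dict[str, str]:
    """Assign one dedicated special token per tool node, in sorted name order."""
    return {tool: f"<tool_{i}>" for i, tool in enumerate(sorted(graph))}


def allowed_next_tokens(
    prefix: List[str],
    graph: Dict[str, Set[str]],
    tool_to_token: Dict[str, str],
) -> Set[str]:
    """Return the tool tokens that keep the plan on a valid path.

    With an empty prefix any tool may start the plan; otherwise only the
    direct successors of the last chosen tool are allowed.
    """
    if not prefix:
        return set(tool_to_token.values())
    return {tool_to_token[t] for t in graph[prefix[-1]]}


tokens = build_tool_tokens(TOOL_GRAPH)
print(tokens)
print(allowed_next_tokens(["fetch_page"], TOOL_GRAPH, tokens))
print(allowed_next_tokens(["fetch_page", "summarize"], TOOL_GRAPH, tokens))
```

The last call shows the error-accumulation problem in miniature: once an early step lands on a node with no outgoing edges, no legal continuation remains, so every later prediction is off the valid execution path.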

Problem

Research questions and friction points this paper is trying to address.

tool planning

execution dependencies

error accumulation

graph constraints

LLM alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-tokenized LLMs

dependency-aware planning

tool graph internalization