🤖 AI Summary
Current large language model (LLM) agent workflow systems rely on predefined templates and shallow matching, which limits their ability to capture deep semantic relationships and to generalize to unseen tasks. This work proposes GraphFlow, a system built on wGraph, a unified graph in which each node is an atomic operation and which serves as a shared substrate for workflows. From wGraph, GraphFlow dynamically generates workflows adapted to task semantics and constraints. To reduce redundant computation during serving, it also manages key-value (KV) caches according to the wGraph structure. Across five benchmark datasets, GraphFlow improves average performance by approximately 4.95 percentage points over state-of-the-art methods while reducing memory footprint by approximately 4×.
📝 Abstract
Large Language Model (LLM)-based agents demonstrate strong reasoning and execution capabilities on complex tasks when guided by structured instructions, commonly referred to as workflows. However, existing workflow-assisted agent serving systems typically rely on predefined templates and shallow matching mechanisms, which limit their ability to capture deep semantic relationships and generalize to previously unseen tasks. To address these limitations, we propose a new workflow management paradigm that represents workflows using a unified graph, termed wGraph, where each node corresponds to an atomic operation. wGraph serves as a shared substrate from which task-specific workflows are dynamically instantiated. Building on wGraph primitives, we introduce GraphFlow, a system that efficiently integrates workflows into agent serving through two key designs. First, adaptive workflow generation dynamically constructs workflows from wGraph based on task semantics and constraint requirements. Second, workflow state management exploits wGraph structure to efficiently manage Key-Value (KV) caches, reducing redundant computation during agent serving. Extensive experiments across five benchmark datasets show that GraphFlow consistently outperforms state-of-the-art methods, yielding an average performance improvement of approximately 4.95 percentage points, while achieving an approximately 4$\times$ reduction in memory footprint.
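The two designs described above can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract does not specify wGraph's schema, the matching procedure, or the cache layout, so everything here is an assumption made for illustration. This includes the `WGraph` and `PrefixCache` classes, the tag-overlap matching, the `max_steps` constraint, and the operation names. In particular, the cache stores placeholder strings rather than real attention key/value tensors, and simple tag overlap stands in for whatever semantic matching GraphFlow actually uses.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class WGraph:
    """Toy shared graph: each node is one atomic operation (hypothetical schema)."""
    tags: dict = field(default_factory=dict)  # op name -> set of descriptive tags
    deps: dict = field(default_factory=dict)  # op name -> set of prerequisite ops

    def add_op(self, name, tags, deps=()):
        self.tags[name] = set(tags)
        self.deps[name] = set(deps)

    def instantiate(self, task_tags, max_steps):
        """Build a task-specific workflow as an ordered list of op names.

        Assumption: "task semantics" is reduced to tag overlap, and the only
        constraint is an upper bound on the number of steps.
        """
        wanted = set(task_tags)
        selected = set()
        stack = [op for op, t in self.tags.items() if t & wanted]
        while stack:  # pull in transitive prerequisites of every matched op
            op = stack.pop()
            if op not in selected:
                selected.add(op)
                stack.extend(self.deps[op])
        if len(selected) > max_steps:
            raise ValueError(f"workflow needs {len(selected)} steps, limit is {max_steps}")
        subgraph = {op: self.deps[op] for op in selected}
        # static_order() yields prerequisites before the ops that depend on them
        return list(TopologicalSorter(subgraph).static_order())


class PrefixCache:
    """Toy stand-in for structure-aware KV caching: state is keyed by workflow prefix."""

    def __init__(self):
        self._store = {}

    def run(self, workflow):
        """Execute a workflow and return how many steps reused cached state."""
        reused = 0
        for i in range(1, len(workflow) + 1):
            key = tuple(workflow[:i])
            if key in self._store:
                reused += 1
            else:
                self._store[key] = f"state after {key[-1]}"  # placeholder, not real KV tensors
        return reused


g = WGraph()
g.add_op("parse_task", tags={"general"})
g.add_op("decompose", tags={"math", "code"}, deps={"parse_task"})
g.add_op("compute", tags={"math"}, deps={"decompose"})
g.add_op("write_code", tags={"code"}, deps={"decompose"})
g.add_op("run_tests", tags={"code"}, deps={"write_code"})

math_flow = g.instantiate({"math"}, max_steps=5)
code_flow = g.instantiate({"code"}, max_steps=5)

cache = PrefixCache()
first_reuse = cache.run(math_flow)   # nothing cached yet
second_reuse = cache.run(code_flow)  # shares the parse_task -> decompose prefix
```

In this sketch both workflows are carved out of the same graph, so the second one begins with the same two operations as the first and can reuse their cached state. That shared-prefix reuse is the intuition behind exploiting graph structure to avoid redundant computation. How GraphFlow identifies and stores reusable state in practice is not described in the abstract.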