GNNs as Predictors of Agentic Workflow Performances

📅 2025-03-14
🤖 AI Summary
Evaluating LLM-driven agent workflows is costly and inefficient because each evaluation requires repeated LLM invocations. To address this, we propose the first Graph Neural Network (GNN)-based performance prediction framework: workflows are modeled as computational graphs, and a GNN trained end to end estimates their performance efficiently and accurately, sharply reducing the number of LLM calls. Our key contributions are: (1) the first systematic application of GNNs to agent workflow performance prediction; (2) FLORA-Bench, the first unified benchmark covering diverse tasks and architectural variants; and (3) empirical validation showing that our lightweight model achieves higher prediction accuracy than baselines while reducing evaluation overhead by several orders of magnitude. All code, models, and data are publicly released.

📝 Abstract
Agentic workflows invoked by Large Language Models (LLMs) have achieved remarkable success in handling complex tasks. However, optimizing such workflows is costly and inefficient in real-world applications because it requires extensive LLM invocations. To fill this gap, this position paper formulates agentic workflows as computational graphs and advocates Graph Neural Networks (GNNs) as efficient predictors of agentic workflow performance, avoiding repeated LLM invocations for evaluation. To empirically ground this position, we construct FLORA-Bench, a unified platform for benchmarking GNNs that predict agentic workflow performance. Extensive experiments lead to the following conclusion: GNNs are simple yet effective predictors. This conclusion supports new applications of GNNs and a novel direction towards automating agentic workflow optimization. All code, models, and data are available at https://github.com/youngsoul0731/Flora-Bench.
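
To make the graph formulation in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's model or the FLORA-Bench API: the workflow, the `node_features` stand-in for prompt embeddings, the parameter-free mean aggregation, and the untrained `weights` are all assumptions chosen for illustration. It only shows the general shape of the idea: agents become nodes, message flow becomes edges, and a GNN-style forward pass turns the graph into one predicted success score without calling an LLM.

```python
import math
import random

# Hypothetical workflow: each node is an agent, each directed edge passes
# one agent's output to another agent.
workflow = {
    "nodes": ["planner", "coder", "reviewer"],
    "edges": [(0, 1), (1, 2), (0, 2)],
}

DIM = 4  # toy feature size; a real system would likely use text embeddings


def node_features(name, dim=DIM):
    """Stand-in for a prompt embedding: a deterministic pseudo-random vector."""
    rng = random.Random(name)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def message_passing(features, edges):
    """One round of mean aggregation over incoming edges plus a self term."""
    incoming = {i: [] for i in range(len(features))}
    for src, dst in edges:
        incoming[dst].append(src)
    updated = []
    for i, own in enumerate(features):
        neigh = incoming[i]
        agg = [0.0] * len(own)
        for j in neigh:
            for k, value in enumerate(features[j]):
                agg[k] += value / len(neigh)
        updated.append([math.tanh(a + b) for a, b in zip(own, agg)])
    return updated


def predict_success(workflow, weights, rounds=2):
    """Return a score in (0, 1) read as a predicted success probability."""
    h = [node_features(n) for n in workflow["nodes"]]
    for _ in range(rounds):
        h = message_passing(h, workflow["edges"])
    # Mean-pool node states into one graph vector, then a logistic readout.
    pooled = [sum(col) / len(h) for col in zip(*h)]
    logit = sum(w * x for w, x in zip(weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


weights = [0.5, -0.25, 0.75, 0.1]  # untrained placeholder parameters
score = predict_success(workflow, weights)
print(f"predicted success probability: {score:.3f}")
```

In a trained predictor the aggregation and readout parameters would be fitted on workflows whose true outcomes are already known, and the task description would presumably also enter the input; both are left out here to keep the sketch short.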
Problem

Research questions and friction points this paper is trying to address.

Optimizing agentic workflows is costly due to frequent LLM invocations.
GNNs are proposed as efficient predictors that reduce LLM usage during workflow evaluation.
FLORA-Bench provides a unified benchmark for GNNs that predict agentic workflow performance.
Innovation

Methods, ideas, or system contributions that make the work stand out.

GNNs predict agentic workflow performance efficiently.
FLORA-Bench benchmarks GNNs on workflow performance prediction.
GNN predictions replace repeated LLM invocations during workflow optimization.
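
One way a cheap predictor could cut LLM invocations during optimization is to rank all candidate workflows by predicted score and run the expensive evaluation only on a short list. The sketch below illustrates that pattern under stated assumptions: `select_workflows`, `fake_llm_evaluate`, the candidate names, and every number are hypothetical and made up for the example; none of them are results or code from the paper.

```python
def select_workflows(candidates, predictor, llm_evaluate, budget):
    """Score every candidate cheaply, then spend LLM calls only on the top few."""
    ranked = sorted(candidates, key=predictor, reverse=True)
    shortlist = ranked[:budget]
    results = {name: llm_evaluate(name) for name in shortlist}
    best = max(results, key=results.get)
    return best, results


# Made-up stand-ins so the sketch runs without a model or an API key.
predicted = {"chain": 0.42, "debate": 0.71, "reflect": 0.65, "single": 0.30}
actual = {"chain": 0.40, "debate": 0.68, "reflect": 0.74, "single": 0.25}
llm_calls = []


def fake_llm_evaluate(name):
    llm_calls.append(name)  # each entry represents one expensive evaluation
    return actual[name]


best, results = select_workflows(
    list(predicted), predicted.get, fake_llm_evaluate, budget=2
)
print(best, len(llm_calls))  # prints "reflect 2"
```

Here only two of the four candidates are ever evaluated with the (simulated) LLM. The predictor does not need to be perfectly calibrated for this to help; it mainly needs to rank good workflows above poor ones often enough that the short list contains a strong candidate.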