GraphSeek: Next-Generation Graph Analytics with LLMs

📅 2026-02-11

🤖 AI Summary

This work addresses the inefficiency and poor robustness of current large language models (LLMs) when they generate graph queries directly, which prevents non-experts from running effective natural language analyses on large-scale, heterogeneous, and dynamically evolving property graphs. To overcome this limitation, the authors propose GraphSeek, a framework that decouples semantic reasoning from query execution. GraphSeek uses a semantic catalog to guide the LLM in planning and reasoning, and delegates actual query execution to a deterministic graph query engine, which avoids generating fragile, syntactically invalid queries. This approach substantially improves both the effectiveness and the token efficiency of small-context LLMs in complex graph analytics. Experimental results show that GraphSeek achieves an 86% success rate on complex tasks, outperforming an enhanced LangChain baseline and offering a cost-effective, end-to-end solution for large-scale graph analysis.

📝 Abstract

Graphs are foundational across domains but remain hard to use without deep expertise. LLMs promise accessible natural language (NL) graph analytics, yet they fail to process industry-scale property graphs effectively and efficiently: such datasets are large, highly heterogeneous, structurally complex, and evolve dynamically. To address this, we devise a novel abstraction for complex multi-query analytics over such graphs. Its key idea is to replace brittle generation of graph queries directly from NL with planning over a Semantic Catalog that describes both the graph schema and the graph operations. Concretely, this induces a clean separation between a Semantic Plane for LLM planning and broader reasoning, and an Execution Plane for deterministic, database-grade query execution over the full dataset and tool implementations. This design yields substantial gains in both token efficiency and task effectiveness even with small-context LLMs. We use this abstraction as the basis of the first LLM-enhanced graph analytics framework called GraphSeek. GraphSeek achieves substantially higher success rates (e.g., 86% over enhanced LangChain) and points toward the next generation of affordable and accessible graph analytics that unify LLM reasoning with database-grade execution over large and complex property graphs.

Problem

Research questions and friction points this paper is trying to address.

graph analytics

large language models

property graphs

natural language interface

scalability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Analytics

Large Language Models

Semantic Catalog