A Survey of Large Language Models on Generative Graph Analytics: Query, Learning, and Applications

📅 2024-04-23
🏛️ arXiv.org
📈 Citations: 9
Influential: 0
🤖 AI Summary
This survey addresses the inherent limitation of large language models (LLMs) in directly modeling non-sequential graph structures, systematically reviewing their state-of-the-art applications in generative graph analytics. It proposes the first taxonomy of this area, LLM-GGA, organized into three categories: LLM-based graph query processing (LLM-GQP), LLM-based graph inference and learning (LLM-GIL), and graph-LLM-based applications, tracing the shift from sequential token prediction to graph-aware semantic understanding and generative reasoning. The covered techniques include knowledge graph-enhanced retrieval, structured prompt engineering, graph representation learning, and graph-formed reasoning. Synthesizing over 100 studies, the survey categorizes prompt design strategies, benchmark datasets, and evaluation protocols; identifies core challenges, including structural fidelity, scalability, and grounding; and outlines concrete directions for future research at the intersection of graph intelligence and foundation models.

📝 Abstract
A graph is a fundamental data model for representing various entities and their complex relationships in society and nature, such as social networks, transportation networks, financial networks, and biomedical systems. Recently, large language models (LLMs) have showcased a strong generalization ability in handling various NLP and multimodal tasks, answering users' arbitrary questions and generating domain-specific content. Compared with graph learning models, LLMs offer clear advantages for generalizing across graph tasks by eliminating the need to train graph learning models and reducing the cost of manual annotation. In this survey, we conduct a comprehensive investigation of existing LLM studies on graph data, summarizing the relevant graph analytics tasks solved by advanced LLMs and pointing out the remaining challenges and future directions. Specifically, we study the key problems of LLM-based generative graph analytics (LLM-GGA) in three categories: LLM-based graph query processing (LLM-GQP), LLM-based graph inference and learning (LLM-GIL), and graph-LLM-based applications. LLM-GQP focuses on the integration of graph analytics techniques and LLM prompts, including graph understanding and knowledge graph (KG)-based augmented retrieval, while LLM-GIL focuses on learning and reasoning over graphs, including graph learning, graph-formed reasoning, and graph representation. We summarize the useful prompts incorporated into LLMs to handle different graph downstream tasks. Moreover, we give a summary of LLM evaluation, benchmark datasets/tasks, and an in-depth analysis of the pros and cons of LLM models. We also explore open problems and future directions in this exciting interdisciplinary research area of LLMs and graph analytics.
Problem

Research questions and friction points this paper is trying to address.

Adapting LLMs to analyze non-sequential graph data structures
Integrating graph analytics with LLM prompts for query processing
Enhancing graph learning and reasoning using large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based graph query processing with prompt integration
Graph inference and learning via LLM reasoning techniques
Hybrid approaches combining graph analytics with LLMs
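The LLM-GQP idea above, integrating graph analytics with LLM prompts, can be sketched minimally: a graph's structure is serialized into natural-language text that is then combined with a query for the model. The following Python sketch is illustrative only and is not taken from the paper; the function name, the example graph, and the prompt wording are all assumptions, and the ground-truth answer is computed locally for comparison rather than by calling an actual LLM.

```python
from collections import Counter

def encode_graph_prompt(edges, question):
    """Flatten an edge list into a natural-language context plus a query
    (one common prompt-design strategy surveyed under LLM-GQP)."""
    lines = [f"Node {u} is connected to node {v}." for u, v in edges]
    return "Graph:\n" + "\n".join(lines) + f"\nQuestion: {question}"

# Hypothetical example graph: 4 nodes, 4 undirected edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
prompt = encode_graph_prompt(edges, "What is the degree of node 2?")
print(prompt)

# Ground truth computed directly from the edge list, so an LLM's
# answer to the prompt could be checked against it.
deg = Counter()
for u, v in edges:
    deg[u] += 1
    deg[v] += 1
print(deg[2])  # 3
```

In practice, the surveyed approaches vary this serialization step (edge lists, adjacency descriptions, or KG triples for retrieval augmentation), since how the graph is verbalized strongly affects whether the LLM can recover its structure.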
Wenbo Shang
Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
Xin Huang
Department of Computer Science, Hong Kong Baptist University, Hong Kong, China