Using Large Language Models to Tackle Fundamental Challenges in Graph Learning: A Comprehensive Survey

📅 2025-05-24
🤖 AI Summary
Real-world graph data commonly suffer from four fundamental challenges: incompleteness, label and structural imbalance, cross-domain heterogeneity, and dynamic instability. Together, these severely undermine the efficacy of conventional graph learning methods. To address them, this work introduces the first unified analytical framework for large language model (LLM)-enhanced graph learning, systematically explaining how semantic reasoning, external knowledge injection, and context-aware modeling augment graph representation learning. Methodologically, the survey covers prompt engineering, instruction tuning, graph–text aligned embedding, knowledge distillation, and multimodal graph encoding, synthesizing over one hundred state-of-the-art studies into an open-source literature repository. The analysis identifies five key open problems and proposes an interdisciplinary roadmap for future work. This work aims to establish both theoretical foundations and practical paradigms for the deep integration of LLMs and graph learning.

📝 Abstract
Graphs are a widely used paradigm for representing non-Euclidean data, with applications ranging from social network analysis to biomolecular prediction. Conventional graph learning approaches typically rely on fixed structural assumptions or fully observed data, limiting their effectiveness in more complex, noisy, or evolving settings. Real-world graph data often violate these assumptions, which gives rise to four fundamental challenges: (1) Incompleteness: real-world graphs have missing nodes, edges, or attributes; (2) Imbalance: the distributions of node or edge labels and of structures in real-world graphs are highly skewed; (3) Cross-domain Heterogeneity: graphs from different domains exhibit incompatible feature spaces or structural patterns; and (4) Dynamic Instability: graphs evolve over time in unpredictable ways. Recent advances in Large Language Models (LLMs) offer the potential to tackle these challenges by leveraging rich semantic reasoning and external knowledge. This survey provides a comprehensive review of how LLMs can be integrated with graph learning to address these challenges. For each challenge, we review both traditional solutions and modern LLM-driven approaches, highlighting the unique advantages that LLMs contribute. Finally, we discuss open research questions and promising future directions in this emerging interdisciplinary field. To support further exploration, we have curated a repository of recent advances on graph learning challenges: https://github.com/limengran98/Awesome-Literature-Graph-Learning-Challenges.
Problem

Research questions and friction points this paper is trying to address.

Address incompleteness in real-world graphs with missing elements
Mitigate imbalance in skewed graph label distributions
Resolve cross-domain heterogeneity in feature spaces and structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging LLMs for incomplete graph data
Addressing imbalance with LLM-driven approaches
Using LLMs to handle dynamic graph instability
Authors

Mengran Li, Sun Yat-sen University (network science, heterogeneous graph, hypergraph)
Pengyu Zhang, University of Amsterdam, Amsterdam, The Netherlands
Wenbin Xing, Guangdong Key Laboratory of Intelligent Transportation System, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
Yijia Zheng, University of Amsterdam, Amsterdam, The Netherlands
Klim Zaporojets, Postdoctoral researcher, Aarhus University (Natural Language Processing, Information Extraction, Machine Learning)
Junzhou Chen, Guangdong Key Laboratory of Intelligent Transportation System, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
Ronghui Zhang, Guangdong Key Laboratory of Intelligent Transportation System, School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, Guangdong, China
Yong Zhang, Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China
Siyuan Gong, School of Information and Engineering, Chang’an University, Xi’an, 710064, Shaanxi, China
Jia Hu, University of Exeter (edge-cloud computing, resource optimization, smart city, network security, applied machine learning)
Xiaolei Ma, Professor, Beihang University (Transportation)
Zhiyuan Liu, Jiangsu Key Laboratory of Urban ITS, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, School of Transportation, Southeast University, Nanjing, 210096, Jiangsu, China
Paul Groth, Professor, INDE Lab, University of Amsterdam (provenance, information integration, web data, knowledge graphs, data engineering)
Marcel Worring, University of Amsterdam, Amsterdam, The Netherlands