🤖 AI Summary
Real-world graph data commonly suffer from four fundamental challenges: incompleteness, label and structural imbalance, cross-domain heterogeneity, and dynamic instability, all of which severely undermine the efficacy of conventional graph learning methods. To address these challenges, this work introduces the first unified analytical framework for large language model (LLM)-enhanced graph learning, systematically elucidating how semantic reasoning, external knowledge injection, and context-aware modeling augment graph representation learning. Methodologically, the framework integrates prompt engineering, instruction tuning, graph–text aligned embedding, knowledge distillation, and multimodal graph encoding, synthesizing over one hundred state-of-the-art studies into an open-source literature repository. The analysis identifies five key open problems and proposes an interdisciplinary evolutionary roadmap, establishing both theoretical foundations and practical paradigms for the deep integration of LLMs and graph learning.
📝 Abstract
Graphs are a widely used paradigm for representing non-Euclidean data, with applications ranging from social network analysis to biomolecular prediction. Conventional graph learning approaches typically rely on fixed structural assumptions or fully observed data, limiting their effectiveness in complex, noisy, or evolving settings. Consequently, real-world graph data often violate the assumptions of traditional graph learning methods, giving rise to four fundamental challenges: (1) Incompleteness: real-world graphs frequently have missing nodes, edges, or attributes; (2) Imbalance: the label and structural distributions of real-world graphs are highly skewed; (3) Cross-domain Heterogeneity: graphs from different domains exhibit incompatible feature spaces or structural patterns; and (4) Dynamic Instability: graphs evolve over time in unpredictable ways. Recent advances in Large Language Models (LLMs) offer the potential to tackle these challenges by leveraging rich semantic reasoning and external knowledge. This survey provides a comprehensive review of how LLMs can be integrated with graph learning to address these challenges. For each challenge, we review both traditional solutions and modern LLM-driven approaches, highlighting the unique advantages that LLMs contribute. Finally, we discuss open research questions and promising future directions in this emerging interdisciplinary field. To support further exploration, we have curated a repository of recent advances on graph learning challenges: https://github.com/limengran98/Awesome-Literature-Graph-Learning-Challenges.