🤖 AI Summary
Large language models (LLMs) lack native capability to process graph-structured data, hindering their application in graph reasoning tasks.
Method: This paper proposes a semantic-driven graph linearization method that maps graphs into token sequences respecting natural language's local dependency and global alignment properties. We first formally define linearization principles balancing locality and global alignment. Then, we design an interpretable and generalizable node ordering strategy integrating PageRank, degree centrality, and k-core decomposition order, augmented by node relabeling for robustness. Finally, we introduce an end-to-end prompt-tuning and evaluation framework tailored for LLMs.
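The ordering strategy described above can be sketched in plain Python. This is an illustrative reconstruction, not the paper's exact implementation: the composite sort key combining k-core depth, PageRank, and degree, and the tie-breaking rules, are all assumptions.

```python
# Illustrative node-ordering sketch: rank nodes by k-core depth, then
# PageRank, then degree, and relabel them 0..n-1 in that order.
# Graphs are dicts mapping node -> list of neighbors (undirected).

def pagerank(adj, d=0.85, iters=50):
    """Basic power-iteration PageRank (uniform teleport)."""
    n = len(adj)
    pr = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1 - d) / n for v in adj}
        for v, nbrs in adj.items():
            if nbrs:
                share = d * pr[v] / len(nbrs)
                for u in nbrs:
                    new[u] += share
            else:  # dangling node: spread mass uniformly
                for u in adj:
                    new[u] += d * pr[v] / n
        pr = new
    return pr

def core_numbers(adj):
    """k-core decomposition by iteratively peeling min-degree nodes."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    core, remaining, k = {}, set(adj), 0
    while remaining:
        v = min(remaining, key=lambda x: deg[x])
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core

def order_nodes(adj):
    """Sort nodes by (core number, PageRank, degree), descending.
    The composite key is an illustrative choice, not the paper's."""
    pr, core = pagerank(adj), core_numbers(adj)
    return sorted(adj, key=lambda v: (core[v], pr[v], len(adj[v])),
                  reverse=True)

def relabel(adj, order):
    """Relabel nodes 0..n-1 following the computed order."""
    mapping = {v: i for i, v in enumerate(order)}
    new_adj = {mapping[v]: sorted(mapping[u] for u in nbrs)
               for v, nbrs in adj.items()}
    return new_adj, mapping

# Example: a triangle (0,1,2) with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
order = order_nodes(adj)  # node 2 ranks first, pendant node 3 last
```

Relabeling after ordering means the token sequence no longer depends on arbitrary original node identifiers, which is the robustness property the method targets.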
Contribution/Results: Our method achieves significant improvements over random linearization baselines across multiple graph reasoning benchmarks. Empirical results demonstrate that the proposed representation effectively activates LLMs' implicit graph cognition capabilities. This work establishes a novel paradigm for unifying graph and textual modeling within multimodal Transformer architectures.
📄 Abstract
Large language models have evolved to process multiple modalities beyond text, such as images and audio, which motivates us to explore how to leverage them effectively for graph reasoning tasks. The key question, therefore, is how to transform graphs into linear sequences of tokens, a process we term "graph linearization", so that LLMs can handle graphs naturally. We posit that graphs should be linearized meaningfully to reflect certain properties of natural language text, such as local dependency and global alignment, so that contemporary LLMs, trained on trillions of textual tokens, can better understand graphs. To achieve this, we developed several graph linearization methods based on graph centrality and degeneracy. These methods are further enhanced using node relabeling techniques. The experimental results demonstrate the effectiveness of our methods compared to the random linearization baseline. Our work introduces novel graph representations suitable for LLMs, contributing to the potential integration of graph machine learning with the trend of multimodal processing using a unified Transformer model.
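To make "graph linearization" concrete, the snippet below serializes a graph into a flat textual edge list following a given node ordering. The prompt template and edge notation are assumptions for illustration; the paper does not prescribe this exact format.

```python
# Hypothetical linearization: given a node ordering, relabel nodes by
# their rank and emit edges as a flat token sequence an LLM can read.

def linearize(adj, order):
    """adj: dict node -> neighbor list (undirected); order: ranked nodes."""
    mapping = {v: i for i, v in enumerate(order)}  # rank-based relabeling
    edges = sorted({tuple(sorted((mapping[u], mapping[v])))
                    for u in adj for v in adj[u]})
    return "Graph: " + ", ".join(f"({a}-{b})" for a, b in edges)

# A triangle on nodes {0, 1, 2}, linearized with node 2 ranked first.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(linearize(adj, [2, 0, 1]))  # -> Graph: (0-1), (0-2), (1-2)
```

Because edges are emitted in the order induced by the ranking, highly central nodes appear early and close together in the sequence, which is how the linearization aims to mirror the local dependency and global alignment of natural text.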