MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining

📅 2024-03-02
🏛️ arXiv.org
📈 Citations: 12
Influential: 0
🤖 AI Summary
Graph mining models suffer from poor generalization, requiring task- or dataset-specific training. Method: This paper proposes an LLM-GNN collaborative framework for generic graph mining, integrating GNN-based structural representation, LLM-powered semantic understanding, chain-of-thought (CoT) reasoning, and structure-aware instruction tuning. Key innovations include: (1) a graph-aware instruction tuning paradigm; (2) an adaptive graph compression and description method; and (3) an LLM-distillation-based CoT instruction generation mechanism with dynamic, cross-task instruction package allocation. Contribution/Results: The framework achieves state-of-the-art performance across diverse graph mining tasks, including node classification, graph classification, and link prediction, demonstrating strong cross-dataset and cross-task generalization with substantial average accuracy gains over prior methods.

📝 Abstract
Graphs with abundant attributes are essential in modeling interconnected entities and improving predictions in various real-world applications. Traditional Graph Neural Networks (GNNs), which are commonly used for modeling attributed graphs, need to be re-trained whenever they are applied to different graph tasks and datasets. Although the emergence of Large Language Models (LLMs) has introduced a new paradigm in natural language processing, the generative potential of LLMs in graph mining remains largely under-explored. To this end, we propose a novel framework MuseGraph, which seamlessly integrates the strengths of GNNs and LLMs and facilitates a more effective and generic approach for graph mining across different tasks and datasets. Specifically, we first introduce a compact graph description via the proposed adaptive input generation to encapsulate key information from the graph under the constraints of language token limitations. Then, we propose a diverse instruction generation mechanism, which distills the reasoning capabilities from LLMs (e.g., GPT-4) to create task-specific Chain-of-Thought-based instruction packages for different graph tasks. Finally, we propose a graph-aware instruction tuning with a dynamic instruction package allocation strategy across tasks and datasets, ensuring the effectiveness and generalization of the training process. Our experimental results demonstrate significant improvements in different graph tasks, showcasing the potential of our MuseGraph in enhancing the accuracy of graph-oriented downstream tasks while preserving the generative powers of LLMs.
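The "compact graph description via adaptive input generation" step can be pictured as serializing a node's attributes plus a budget-limited slice of its neighborhood into text. The sketch below is a hypothetical illustration under assumed names (`describe_node`, a word-count token heuristic), not the paper's actual algorithm, which uses more sophisticated compression.

```python
# Illustrative sketch: build a textual description of a node and its
# neighbors, stopping when an (approximate) token budget is exhausted.
# The token estimator and all function names are assumptions for this sketch.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: roughly one token per whitespace-separated word.
    return len(text.split())

def describe_node(node_id, attrs, neighbors, max_tokens=64):
    """Serialize a node and as many neighbors as fit within max_tokens."""
    desc = f"Node {node_id}: {attrs[node_id]}."
    for nbr in neighbors.get(node_id, []):
        candidate = desc + f" Linked to node {nbr} ({attrs[nbr]})."
        if estimate_tokens(candidate) > max_tokens:
            break  # budget exhausted; drop remaining neighbors
        desc = candidate
    return desc

attrs = {0: "paper on graph neural networks",
         1: "paper on instruction tuning",
         2: "paper on chain-of-thought prompting"}
neighbors = {0: [1, 2]}
print(describe_node(0, attrs, neighbors, max_tokens=20))
```

With a budget of 20, the description keeps node 1 but drops node 2, showing how the neighborhood is truncated rather than the prompt overflowing the context window.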
Problem

Research questions and friction points this paper is trying to address.

Integrates GNNs and LLMs for generic graph mining tasks
Enables single model handling of diverse graph datasets
Prevents catastrophic forgetting while enhancing LLM generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates GNNs and LLMs into one foundation model
Uses Chain-of-Thought instruction packages for reasoning distillation
Implements graph-aware tuning to prevent catastrophic forgetting
Yanchao Tan
Fuzhou University
Data mining · Recommender systems · Healthcare
Hang Lv
College of Computer and Data Science, Fuzhou University, China
Xin Huang
College of Computer and Data Science, Fuzhou University, China
Jiawei Zhang
College of Computer and Data Science, Fuzhou University, China
Shiping Wang
Fuzhou University
Machine learning · Explainable deep learning · Multi-view learning · Graph neural networks
Carl Yang
Waymo LLC, PhD at University of California, Davis
GPU Computing · Parallel Computing · Graph Processing