🤖 AI Summary
Graph Neural Networks (GNNs) struggle to model heterophilous graph structures and to capture long-range dependencies, while Graph Transformers alleviate these issues but suffer from poor scalability and limited robustness to noise. To address these limitations, we propose GNNMoE, a novel paradigm for general node classification that integrates a Mixture-of-Experts (MoE) mechanism with fine-grained decoupled message passing. We design a soft-hard dual-mode gating mechanism for node-level dynamic expert selection, and introduce adaptive residual connections and an enhanced feed-forward network to improve representation robustness and generalization. Extensive experiments on multiple graph benchmarks demonstrate that GNNMoE significantly outperforms state-of-the-art methods, effectively mitigating over-smoothing and global noise. Moreover, it exhibits strong cross-graph-type adaptability, superior robustness to structural and feature perturbations, and high computational efficiency on large-scale graphs.
📝 Abstract
Graph neural networks excel at graph representation learning but struggle with heterophilous data and long-range dependencies, while graph transformers address these issues through self-attention yet face scalability and noise challenges on large-scale graphs. To overcome these limitations, we propose GNNMoE, a universal model architecture for node classification. This architecture flexibly combines fine-grained message-passing operations with a mixture-of-experts mechanism to build feature-encoding blocks. Furthermore, by incorporating soft and hard gating layers that assign the most suitable expert networks to each node, we enhance the model's expressive power and its adaptability to different graph types. In addition, we introduce adaptive residual connections and an enhanced FFN module into GNNMoE, further improving the expressiveness of node representations. Extensive experimental results demonstrate that GNNMoE performs exceptionally well across various types of graph data: it effectively alleviates over-smoothing and global noise, enhances model robustness and adaptability, and maintains computational efficiency on large-scale graphs.
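To make the node-level expert-selection idea concrete, below is a minimal, illustrative NumPy sketch of a mixture-of-experts block with soft and hard gating. This is not the paper's implementation: the linear "experts", the gating matrix, and all names (`MoEBlock`, `forward`, `hard`) are hypothetical stand-ins for the actual message-passing experts and gating layers described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class MoEBlock:
    """Toy mixture-of-experts block (hypothetical sketch, not GNNMoE itself).

    Each 'expert' is a simple linear map standing in for a message-passing
    operation; a gating network mixes expert outputs per node.
    """
    def __init__(self, d_in, d_out, n_experts):
        self.experts = [rng.normal(size=(d_in, d_out)) * 0.1
                        for _ in range(n_experts)]
        self.gate = rng.normal(size=(d_in, n_experts)) * 0.1

    def forward(self, X, hard=False):
        scores = X @ self.gate                      # (n_nodes, n_experts)
        if hard:
            # hard gating: route each node to its single top-scoring expert
            weights = np.eye(scores.shape[1])[scores.argmax(axis=1)]
        else:
            # soft gating: convex combination over all experts per node
            weights = softmax(scores, axis=1)
        # stack expert outputs: (n_nodes, n_experts, d_out)
        outs = np.stack([X @ W for W in self.experts], axis=1)
        # weighted sum over the expert axis -> (n_nodes, d_out)
        return (weights[:, :, None] * outs).sum(axis=1)

X = rng.normal(size=(5, 8))          # 5 nodes with 8-dimensional features
block = MoEBlock(8, 4, n_experts=3)
H_soft = block.forward(X)            # soft mode: blend of all experts
H_hard = block.forward(X, hard=True) # hard mode: one expert per node
```

In the soft mode every node receives a weighted blend of all experts, while the hard mode routes each node to exactly one expert; the dual-mode gating in GNNMoE lets the model trade off between these behaviors per node.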