Adaptive Graph Mixture of Residual Experts: Unsupervised Learning on Diverse Graphs with Heterogeneous Specialization

📅 2025-10-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Graph neural networks (GNNs) suffer from inflexible message-passing architectures, limiting adaptability to diverse graph structures and downstream tasks. Existing graph mixture-of-experts (MoE) approaches rely heavily on supervised signals and exhibit unstable training due to expert heterogeneity. To address these limitations, we propose ADaMoRE, a novel unsupervised graph MoE framework. ADaMoRE introduces a backbone-residual expert architecture coupled with a structure-aware gating mechanism and an information-entropy-based diversity regularization, enabling functional specialization of experts and enhanced training stability. The method jointly optimizes a self-supervised graph reconstruction objective and information-theoretic constraints in an end-to-end manner. Evaluated across 16 benchmarks, ADaMoRE achieves state-of-the-art performance in unsupervised node classification and few-shot learning, while demonstrating superior generalization, high training efficiency, and rapid convergence.

πŸ“ Abstract
Graph Neural Networks (GNNs) face a fundamental adaptability challenge: their fixed message-passing architectures struggle with the immense diversity of real-world graphs, where optimal computational strategies vary by local structure and task. While Mixture-of-Experts (MoE) offers a promising pathway to adaptability, existing graph MoE methods remain constrained by their reliance on supervised signals and instability when training heterogeneous experts. We introduce ADaMoRE (Adaptive Mixture of Residual Experts), a principled framework that enables robust, fully unsupervised training of heterogeneous MoE on graphs. ADaMoRE employs a backbone-residual expert architecture where foundational encoders provide stability while specialized residual experts capture diverse computational patterns. A structurally-aware gating network performs fine-grained node routing. The entire architecture is trained end-to-end using a unified unsupervised objective, which integrates a primary reconstruction task with an information-theoretic diversity regularizer to explicitly enforce functional specialization among the experts. Theoretical analysis confirms our design improves data efficiency and training stability. Extensive evaluation across 16 benchmarks validates ADaMoRE's state-of-the-art performance in unsupervised node classification and few-shot learning, alongside superior generalization, training efficiency, and faster convergence on diverse graphs and tasks.
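To make the backbone-residual design described in the abstract concrete, here is a minimal NumPy sketch. All names, shapes, and the purely linear experts are illustrative assumptions, not the paper's implementation; in particular, the paper's gating network is structure-aware, whereas this toy gate routes on node features alone.

```python
# Hypothetical sketch of a backbone-residual mixture of experts for node
# embeddings. Shapes, linear experts, and the feature-only gate are
# illustrative assumptions, not ADaMoRE's actual architecture.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

n_nodes, d, n_experts = 5, 8, 3
X = rng.normal(size=(n_nodes, d))               # node features

W_backbone = rng.normal(size=(d, d))            # shared foundational encoder
W_experts = rng.normal(size=(n_experts, d, d))  # specialized residual experts
W_gate = rng.normal(size=(d, n_experts))        # toy gate (features only)

h_backbone = X @ W_backbone                     # stable base representation
gates = softmax(X @ W_gate)                     # per-node routing weights

# Residual experts refine the backbone output; gate weights mix them per node.
residual = np.einsum('ne,eij,nj->ni', gates, W_experts, X)
H = h_backbone + residual                       # final node embeddings, (5, 8)
```

The backbone term keeps every node's representation anchored even when routing is noisy, which is the stability argument the abstract makes for training heterogeneous experts.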
Problem

Research questions and friction points this paper is trying to address.

Addressing GNN adaptability challenges with diverse graph structures
Enabling unsupervised training of heterogeneous mixture-of-experts on graphs
Improving training stability and specialization without supervised signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

Backbone-residual expert architecture for graph stability
Structurally-aware gating network for node routing
Unified unsupervised objective with diversity regularization
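The diversity-regularization idea in the list above can be illustrated with a small entropy computation on hypothetical gate weights. This is only one common way to phrase an information-entropy term for MoE routing; the paper's exact objective may differ.

```python
# Illustrative entropy-based diversity term on gating weights. The gate
# values and the exact form of the regularizer are assumptions for the sketch.
import numpy as np

def entropy(p, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=-1)

gates = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.1, 0.7]])  # per-node expert weights

# High entropy of the *average* gate distribution spreads load across
# experts; low per-node entropy keeps each routing decision decisive.
load_balance = entropy(gates.mean(axis=0))  # to be maximized
decisiveness = entropy(gates).mean()        # to be minimized

diversity_reg = decisiveness - load_balance  # added to the main loss
```

Minimizing such a term pushes experts to be used evenly overall while each node commits to a few experts, which is one way to enforce the functional specialization the summary describes.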
Yunlong Chu
School of New Media and Communication, Tianjin University, Tianjin, China
Minglai Shao
Tianjin University
Graph Mining · Deep Learning · Machine Learning
Zengyi Wo
College of Intelligence and Computing, Tianjin University
Data Mining · Anomaly Detection · LLM Reasoning
Bing Hao
School of New Media and Communication, Tianjin University, Tianjin, China
Yuhang Liu
The University of Adelaide
Representation Learning · LLMs · Latent Variable Models · Responsible AI
Ruijie Wang
School of Computer Science and Engineering, Beihang University, Beijing, China
Jianxin Li
School of Computer Science and Engineering, Beihang University, Beijing, China