AI Summary
Graph neural networks (GNNs) suffer from inflexible message-passing architectures, limiting adaptability to diverse graph structures and downstream tasks. Existing graph mixture-of-experts (MoE) approaches rely heavily on supervised signals and exhibit unstable training due to expert heterogeneity. To address these limitations, we propose ADaMoRE, a novel unsupervised graph MoE framework. ADaMoRE introduces a backbone-residual expert architecture coupled with a structure-aware gating mechanism and an information-entropy-based diversity regularization, enabling functional specialization of experts and enhanced training stability. The method jointly optimizes a self-supervised graph reconstruction objective and information-theoretic constraints in an end-to-end manner. Evaluated across 16 benchmarks, ADaMoRE achieves state-of-the-art performance in unsupervised node classification and few-shot learning, while demonstrating superior generalization, high training efficiency, and rapid convergence.
Abstract
Graph Neural Networks (GNNs) face a fundamental adaptability challenge: their fixed message-passing architectures struggle with the immense diversity of real-world graphs, where optimal computational strategies vary by local structure and task. While Mixture-of-Experts (MoE) offers a promising pathway to adaptability, existing graph MoE methods remain constrained by their reliance on supervised signals and by instability when training heterogeneous experts. We introduce ADaMoRE (Adaptive Mixture of Residual Experts), a principled framework that enables robust, fully unsupervised training of heterogeneous MoE models on graphs. ADaMoRE employs a backbone-residual expert architecture in which foundational encoders provide stability while specialized residual experts capture diverse computational patterns. A structurally-aware gating network performs fine-grained node routing. The entire architecture is trained end-to-end using a unified unsupervised objective that integrates a primary reconstruction task with an information-theoretic diversity regularizer to explicitly enforce functional specialization among the experts. Theoretical analysis confirms that our design improves data efficiency and training stability. Extensive evaluation across 16 benchmarks validates ADaMoRE's state-of-the-art performance in unsupervised node classification and few-shot learning, alongside superior generalization, training efficiency, and faster convergence on diverse graphs and tasks.
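To make the backbone-residual design concrete, the following is a minimal NumPy sketch of one forward step: a shared backbone transform, gate-weighted residual expert corrections, and an entropy-based diversity term on average expert usage. All shapes, weight matrices, and the linear "experts" are illustrative assumptions, not the paper's actual layers (the real gating network is also structure-aware, i.e. it would additionally consume graph topology features).

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_nodes, d, n_experts = 4, 8, 3

# Hypothetical node embeddings and random linear maps standing in for
# the backbone encoder, the residual experts, and the gating network.
h = rng.normal(size=(n_nodes, d))
W_backbone = rng.normal(size=(d, d))
W_experts = rng.normal(size=(n_experts, d, d))
W_gate = rng.normal(size=(d, n_experts))

# Per-node routing weights over experts (rows sum to 1).
gates = softmax(h @ W_gate)                           # (n_nodes, n_experts)

# Backbone output plus gate-weighted residual expert corrections:
# the backbone provides a stable base, experts add specialized residuals.
backbone_out = h @ W_backbone                          # (n_nodes, d)
expert_out = np.einsum('nd,edk->nek', h, W_experts)    # (n_nodes, n_experts, d)
residual = np.einsum('ne,nek->nk', gates, expert_out)  # (n_nodes, d)
out = backbone_out + residual

# Entropy of the mean gate distribution: maximizing it pushes toward
# balanced expert usage, one simple form of a diversity regularizer.
mean_usage = gates.mean(axis=0)
diversity_reg = -(mean_usage * np.log(mean_usage + 1e-12)).sum()
```

In a full training loop this entropy term would be added (with a sign and coefficient of one's choosing) to the self-supervised reconstruction loss, so that routing diversity is optimized jointly with representation quality.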