Dynamic Mixture-of-Experts for Incremental Graph Learning

📅 2025-08-13
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Graph incremental learning suffers from catastrophic forgetting: models readily lose previously acquired knowledge when adapting to newly arriving graph data. Existing approaches preserve historical model behavior holistically, overlooking the heterogeneous transfer value of knowledge from different timestamps: some patterns facilitate positive transfer to new tasks, while others deviate from the new data distribution and are detrimental. To address this, the paper proposes dynamic mixture-of-experts (DyMoE), in which a DyMoE GNN layer adds a new expert network specialized in each incoming data block. A customized regularization loss that exploits the data sequence constrains how knowledge evolves across time steps, and a top-k sparse gating mechanism selects only the most relevant experts, keeping computation bounded as data blocks accumulate. Under the class-incremental setting, DyMoE achieves a 4.92% relative accuracy increase over the best baselines, significantly mitigating catastrophic forgetting while improving generalization.

๐Ÿ“ Abstract
Graph incremental learning is a learning paradigm that aims to adapt trained models to continuously incremented graphs and data over time without the need for retraining on the full dataset. However, regular graph machine learning methods suffer from catastrophic forgetting when applied to incremental learning settings, where previously learned knowledge is overridden by new knowledge. Previous approaches have tried to address this by treating the previously trained model as an inseparable unit and using techniques to maintain old behaviors while learning new knowledge. These approaches, however, do not account for the fact that previously acquired knowledge at different timestamps contributes differently to learning new tasks. Some prior patterns can be transferred to help learn new data, while others may deviate from the new data distribution and be detrimental. To address this, we propose a dynamic mixture-of-experts (DyMoE) approach for incremental learning. Specifically, a DyMoE GNN layer adds new expert networks specialized in modeling the incoming data blocks. We design a customized regularization loss that utilizes data sequence information so existing experts can maintain their ability to solve old tasks while helping the new expert learn the new data effectively. As the number of data blocks grows over time, the computational cost of the full mixture-of-experts (MoE) model increases. To address this, we introduce a sparse MoE approach, where only the top-$k$ most relevant experts make predictions, significantly reducing the computation time. Our model achieved a 4.92% relative accuracy increase compared to the best baselines on class-incremental learning, showing the model's exceptional power.
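The top-$k$ sparse gating the abstract describes can be sketched as follows. This is a hedged illustration under assumed simplifications (plain Python, scalar inputs, function-valued experts; function names are hypothetical), not the paper's implementation:

```python
import math

def topk_sparse_gate(gate_logits, k):
    """Top-k sparse gating: keep the k largest gate logits, softmax over
    only those, and assign zero weight to all other experts
    (a simplified sketch, not DyMoE's exact gate)."""
    selected = sorted(range(len(gate_logits)),
                      key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = {i: math.exp(gate_logits[i]) for i in selected}
    z = sum(exps.values())
    return [exps.get(i, 0.0) / z for i in range(len(gate_logits))]

def moe_forward(x, experts, gate_logits, k):
    """Run only the selected experts and combine their outputs,
    weighted by the sparse gate; unselected experts are skipped,
    which is where the computational savings come from."""
    weights = topk_sparse_gate(gate_logits, k)
    return sum(w * expert(x) for w, expert in zip(weights, experts) if w > 0.0)
```

With three experts and k = 2, only the two highest-scoring experts contribute; the third expert's function is never evaluated, so per-prediction cost stays bounded as new data blocks (and experts) accumulate.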
Problem

Research questions and friction points this paper is trying to address.

Address catastrophic forgetting in incremental graph learning
Dynamically integrate new expert networks for incoming data
Optimize computational efficiency with sparse mixture-of-experts approach
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic mixture-of-experts for incremental graph learning
Customized regularization loss utilizing data sequence information
Sparse MoE with top-k experts for computational efficiency
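The sequence-aware regularization listed above is not specified in detail here; one plausible reading is a drift penalty on each existing expert, weighted by how recent its data block is. The sketch below is an assumption (the `decay` schedule and squared-drift penalty are hypothetical choices), not the paper's actual loss:

```python
def sequence_weighted_reg(old_outputs, new_outputs, block_ages, decay=0.9):
    """Hypothetical sequence-aware regularizer: penalize the squared drift
    of each existing expert's output, down-weighting older data blocks by
    decay**age so recent knowledge is preserved more strongly."""
    loss = 0.0
    for old, new, age in zip(old_outputs, new_outputs, block_ages):
        loss += (decay ** age) * (new - old) ** 2
    return loss
```

Under this reading, an expert trained on the most recent block (age 0) is constrained at full strength, while experts from older blocks are allowed to drift more, reflecting the abstract's point that knowledge from different timestamps contributes differently to new tasks.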