🤖 AI Summary
Graph incremental learning suffers from catastrophic forgetting: models readily lose previously acquired knowledge when adapting to newly arriving graph data. Existing approaches preserve historical model behavior holistically, overlooking the heterogeneous transfer value of temporal knowledge: some patterns facilitate positive transfer to new tasks, while others induce distributional shifts. To address this, we propose a dynamic mixture-of-experts (DyMoE) framework that introduces time-aware expert networks for graph incremental learning. DyMoE employs a sequence-aware regularization loss to differentially constrain knowledge evolution across time steps and incorporates a top-$k$ sparse gating mechanism for efficient expert selection and reduced computation. Under the class-incremental setting, DyMoE achieves a 4.92% relative accuracy improvement over the strongest baseline, significantly mitigating catastrophic forgetting while enhancing generalization.
📄 Abstract
Graph incremental learning is a paradigm that adapts trained models to continuously growing graph data over time without retraining on the full dataset. However, standard graph machine learning methods suffer from catastrophic forgetting in incremental settings: previously learned knowledge is overridden by new knowledge. Prior approaches address this by treating the previously trained model as an inseparable unit and using techniques to maintain its old behaviors while learning new knowledge. These approaches, however, do not account for the fact that knowledge acquired at different timestamps contributes differently to learning new tasks: some prior patterns transfer and help learn new data, while others deviate from the new data distribution and are detrimental. To address this, we propose a dynamic mixture-of-experts (DyMoE) approach for incremental learning. Specifically, a DyMoE GNN layer adds new expert networks specialized in modeling the incoming data blocks. We design a customized regularization loss that uses data-sequence information so that existing experts maintain their ability to solve old tasks while helping the new expert learn the new data effectively. As the number of data blocks grows over time, the computational cost of the full mixture-of-experts (MoE) model increases; we therefore introduce a sparse MoE variant in which only the top-$k$ most relevant experts make predictions, significantly reducing computation time. Our model achieves a 4.92% relative accuracy increase over the best baselines on class-incremental learning, demonstrating its effectiveness.
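To make the two core mechanisms concrete, the following is a minimal NumPy sketch of (a) a mixture-of-experts layer that appends one expert per data block and routes each input through only the top-$k$ experts, and (b) a sequence-weighted regularization term that constrains older experts more strongly toward their earlier parameters. All names (`SparseMoE`, `time_weighted_reg`, the decay schedule) are illustrative assumptions, not the paper's actual implementation or weighting.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class SparseMoE:
    """Toy top-k sparse mixture-of-experts layer (hypothetical, linear experts)."""

    def __init__(self, d_in, d_out, k=2):
        self.d_in, self.d_out, self.k = d_in, d_out, k
        self.experts = []                       # one weight matrix per data block
        self.gate = np.zeros((0, d_in))         # one gating vector per expert

    def add_expert(self):
        # When a new data block arrives, append a fresh expert and gating row.
        self.experts.append(0.1 * rng.standard_normal((self.d_in, self.d_out)))
        self.gate = np.vstack([self.gate, 0.1 * rng.standard_normal(self.d_in)])

    def forward(self, x):
        scores = self.gate @ x                  # relevance of each expert to x
        k = min(self.k, len(self.experts))
        top = np.argsort(scores)[-k:]           # only the top-k experts compute
        w = softmax(scores[top])
        return sum(wi * (x @ self.experts[i]) for wi, i in zip(w, top))

def time_weighted_reg(experts, snapshots, decay=0.5):
    """Sequence-weighted penalty: expert t (t=0 is oldest) is pulled toward its
    snapshot with weight decay**t, so older experts drift less. The actual
    weighting in the paper is derived from the data-sequence information."""
    return sum(decay ** t * np.sum((e - s) ** 2)
               for t, (e, s) in enumerate(zip(experts, snapshots)))
```

Because only $k$ experts run per input, the per-sample cost stays roughly constant as the expert pool grows with new data blocks, which is the motivation for the sparse variant described above.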