CMKL: Modality-Aware Continual Learning for Evolving Biomedical Knowledge Graphs

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of modeling dynamically evolving multimodal biomedical knowledge graphs. Existing knowledge graph embedding methods suffer from catastrophic forgetting under continual learning and do not account for forgetting dynamics that differ by modality. The authors propose CMKL, a framework that encodes structural, textual, and molecular representations, fuses them through a Mixture-of-Experts routing mechanism, and mitigates catastrophic forgetting with Elastic Weight Consolidation (EWC) regularization and a K-means-diversified multimodal replay buffer. On a 10-task continual learning benchmark comprising 129K entities, CMKL achieves an average precision (AP) of 0.591 on entity classification, a 60% improvement over the strongest structural baseline (0.370), with a forgetting score (AF) of only 0.008. On relation prediction it obtains an AP of 0.062, which matches Naive Sequential and EWC (0.058) within seed noise and exceeds Joint Training (0.047) and LKGE (0.039).
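To make the replay-buffer idea concrete, here is a minimal sketch of K-means-diversified sample selection. It is not the authors' implementation: the function names (`farthest_point_init`, `kmeans`, `select_replay`), the use of one embedding vector per sample, the deterministic farthest-point initialization, and the "one sample nearest each centroid" rule are all assumptions made for illustration.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at the first point, then
    repeatedly add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on lists of floats; returns the k centroids."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def select_replay(embeddings, budget):
    """Pick `budget` distinct sample indices, one nearest to each centroid,
    so the buffer covers different regions of the embedding space."""
    chosen = []
    for centroid in kmeans(embeddings, budget):
        candidates = [i for i in range(len(embeddings)) if i not in chosen]
        chosen.append(min(candidates, key=lambda i: math.dist(embeddings[i], centroid)))
    return sorted(chosen)


# Hypothetical toy data: three well-separated groups of three samples each.
toy = [[0, 0], [0, 1], [1, 0],
       [10, 10], [10, 11], [11, 10],
       [20, 0], [20, 1], [21, 0]]
replay_indices = select_replay(toy, budget=3)
```

With a budget of three on this toy data, the selection keeps one sample from each group rather than three near-duplicates from a single group, which is the diversity property a clustered buffer is meant to provide.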

📝 Abstract

Biomedical knowledge graphs are increasingly large, dynamic, and multimodal, driven by rapid advances in biotechnology such as high-throughput sequencing. Machine learning models can infer previously unobserved biomedical relationships and characterize biomedical entities in these graphs, but existing knowledge graph embedding methods and their continual learning extensions either assume static graph structure or fail to exploit multimodal information under evolving data distributions. They also apply uniform regularization across all model parameters, ignoring that different modalities may exhibit distinct forgetting dynamics as the graph evolves. We propose the Continual Multimodal Knowledge Graph Learner (CMKL), a continual learning (CL) framework for biomedical knowledge graphs (KGs) that natively encodes structure, text, and molecules, fuses them through a Mixture-of-Experts (MoE) router, and protects previously learned knowledge with standard EWC regularization and a K-means-diverse multimodal replay buffer. We evaluate CMKL on a 129K-entity biomedical continual benchmark with 10 tasks. On continual biomedical entity classification, CMKL reaches AP 0.591 versus 0.370 for the strongest structural baseline, a 60% gain that is driven by access to multimodal features and preserved across the sequence with near-zero forgetting (AF 0.008). On continual relationship prediction, CMKL reaches AP 0.062, matching Naive Sequential and EWC (0.058) within seed noise and outperforming Joint Training (0.047, p=0.045) and LKGE (0.039). A frozen-text ablation reaches AP 0.136, more than double any jointly trained model, yet that signal is unreachable by margin-ranking gradients: the greedy-modality asymmetry lives at the representation level, not the fusion level, and MoE routing manages it by suppressing the unreachable modality without forcing it through a learned bottleneck. Code: github.com/yradwan147/cmkl-neurips2026
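The two protective and fusion mechanisms named in the abstract can be sketched in a few lines. The code below is an illustrative simplification, not the released CMKL code: `moe_fuse` treats each modality's embedding as one expert output and mixes them with softmax gate weights, and `ewc_penalty` computes the standard EWC term (λ/2) Σ F_i (θ_i − θ*_i)². The function names, the dictionary-based inputs, and the gate logits are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_fuse(modality_embeddings, gate_scores):
    """Fuse per-modality embeddings with softmax gate weights.

    modality_embeddings: dict name -> list of floats, all the same length.
    gate_scores: dict name -> float logit (stand-in for a learned router).
    Returns the fused vector and the gate weight assigned to each modality.
    """
    names = sorted(modality_embeddings)
    weights = softmax([gate_scores[n] for n in names])
    dim = len(modality_embeddings[names[0]])
    fused = [0.0] * dim
    for name, w in zip(names, weights):
        for j, value in enumerate(modality_embeddings[name]):
            fused[j] += w * value
    return fused, dict(zip(names, weights))


def ewc_penalty(params, anchor_params, fisher, lam):
    """Standard EWC term: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


# Hypothetical toy inputs: three modalities with 2-dimensional embeddings.
embeddings = {"structure": [1.0, 0.0], "text": [0.0, 1.0], "molecule": [1.0, 1.0]}
fused, weights = moe_fuse(embeddings, {"structure": 0.0, "text": 0.0, "molecule": 0.0})

# A very negative logit drives one modality's gate weight to zero.
_, suppressed = moe_fuse(embeddings, {"structure": 0.0, "text": -1e9, "molecule": 0.0})

# Only the first parameter moved, so only its Fisher weight contributes.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 10.0], lam=0.5)
```

The second call shows how a router can suppress a modality by assigning it a near-zero weight instead of passing it through a shared bottleneck. In training, the EWC term would be added to the task loss so that parameters with high Fisher values stay close to their values from earlier tasks.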

Problem

Research questions and friction points this paper is trying to address.

continual learning

multimodal knowledge graph

biomedical knowledge graph

catastrophic forgetting

modality-aware

Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual Learning

Multimodal Knowledge Graph

Mixture-of-Experts