Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

📅 2024-05-27
📈 Citations: 1
Influential: 0
🤖 AI Summary
To jointly model structural triples and heterogeneous multi-modal features in multi-modal knowledge graph completion (MMKGC), this paper proposes MoMoK, a relation-guided Mixture of Modality Knowledge Experts framework. MoMoK introduces a mixture-of-experts mechanism in which modality-specific experts are activated and weighted by relation-conditioned gating, explicitly modeling how much each modality contributes under different relations. By combining multi-expert routing, modality embedding disentanglement, and mutual information minimization, it learns relation-aware adaptive entity representations, in contrast to conventional monolithic fusion approaches. On four public MMKG benchmarks, MoMoK consistently outperforms state-of-the-art methods and remains robust under complex relational patterns and sparse multi-modal inputs.
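
The relation-conditioned gating described above can be sketched in a few lines. The following PyTorch snippet is a minimal illustration under assumed design choices, not the authors' implementation: the class `RelationGuidedMoE`, the per-modality expert MLPs, and all dimensions are hypothetical stand-ins for the paper's modality knowledge experts.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationGuidedMoE(nn.Module):
    """Hypothetical sketch: fuse modality-specific expert embeddings
    with a gate conditioned on the query relation."""

    def __init__(self, dim: int, num_modalities: int, num_relations: int):
        super().__init__()
        # One small expert network per modality (e.g., structure/image/text).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_modalities)
        )
        self.rel_emb = nn.Embedding(num_relations, dim)
        # The gate maps the relation embedding to one weight per expert.
        self.gate = nn.Linear(dim, num_modalities)

    def forward(self, modal_feats: torch.Tensor, rel_ids: torch.Tensor):
        # modal_feats: (batch, num_modalities, dim); rel_ids: (batch,)
        expert_out = torch.stack(
            [exp(modal_feats[:, m]) for m, exp in enumerate(self.experts)],
            dim=1,
        )  # (batch, num_modalities, dim)
        weights = F.softmax(self.gate(self.rel_emb(rel_ids)), dim=-1)
        # Relation-aware fusion: weighted sum over modality experts.
        fused = (weights.unsqueeze(-1) * expert_out).sum(dim=1)
        return fused, expert_out, weights

# Example: 3 modalities (structure, image, text), 64-dim, 10 relations.
moe = RelationGuidedMoE(dim=64, num_modalities=3, num_relations=10)
feats = torch.randn(8, 3, 64)
rels = torch.randint(0, 10, (8,))
entity_repr, per_expert, gate_weights = moe(feats, rels)
```

The key design point is that the gate depends only on the relation, so the same entity can receive a different modality mix in different relational contexts.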

📝 Abstract
Learning high-quality multi-modal entity representations is an important goal of multi-modal knowledge graph (MMKG) representation learning, which can enhance reasoning tasks within the MMKGs, such as MMKG completion (MMKGC). The main challenge is to collaboratively model the structural information concealed in massive triples and the multi-modal features of the entities. Existing methods focus on crafting elegant entity-wise multi-modal fusion strategies, yet they overlook the utilization of multi-perspective features concealed within the modalities under diverse relational contexts. To address this issue, we introduce a novel framework with Mixture of Modality Knowledge experts (MoMoK for short) to learn adaptive multi-modal entity representations for better MMKGC. We design relation-guided modality knowledge experts to acquire relation-aware modality embeddings and integrate the predictions from multi-modalities to achieve joint decisions. Additionally, we disentangle the experts by minimizing their mutual information. Experiments on four public MMKG benchmarks demonstrate the outstanding performance of MoMoK under complex scenarios.
Problem

Research questions and friction points this paper is trying to address.

Learning adaptive multi-modal entity representations for MMKGs.
Modeling structural and multi-modal features collaboratively.
Utilizing multi-perspective features in diverse relational contexts.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Modality Knowledge experts (MoMoK) framework for adaptive multi-modal entity representation learning
Relation-guided modality knowledge experts that produce relation-aware modality embeddings and joint multi-modal decisions
Expert disentanglement by minimizing the mutual information between experts (see the sketch after this list)
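
The abstract only states that the experts are disentangled by minimizing their mutual information, without naming an estimator. One widely used sample-based upper bound for such penalties is CLUB (Cheng et al., ICML 2020); the sketch below is a simplified, assumed variant for illustration, not necessarily the paper's exact estimator.

```python
import torch
import torch.nn as nn

class CLUBUpperBound(nn.Module):
    """Simplified CLUB-style upper bound on I(x; y), using a learned
    Gaussian variational approximation q(y | x). Hypothetical sketch."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.mu = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.logvar = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def loglik(self, x, y):
        # Gaussian log-likelihood log q(y | x), up to an additive constant.
        mu, logvar = self.mu(x), self.logvar(x)
        return (-((y - mu) ** 2) / logvar.exp() - logvar).sum(-1) / 2.0

    def forward(self, x, y):
        # CLUB bound: E_p(x,y)[log q(y|x)] - E_p(x)p(y)[log q(y|x)];
        # negative pairs come from shuffling y within the batch.
        y_shuffled = y[torch.randperm(y.size(0))]
        return (self.loglik(x, y) - self.loglik(x, y_shuffled)).mean()

# Usage: q is first fit by maximizing loglik on matched pairs, then the
# main model minimizes forward() between two experts' output embeddings.
club = CLUBUpperBound(dim=64)
x_expert, y_expert = torch.randn(8, 64), torch.randn(8, 64)
mi_penalty = club(x_expert, y_expert)
```

Driving this penalty down pushes the modality experts toward encoding complementary rather than redundant information.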
👥 Authors
Yichi Zhang
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph
Zhuo Chen
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph
Lingbing Guo
Zhejiang University
Yajing Xu
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph
Binbin Hu
BUPT & Ant Group
Ziqi Liu
Ant Group
Wen Zhang
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph, Alibaba-Zhejiang University Joint Institute of Frontier Technology
Huajun Chen
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph, Alibaba-Zhejiang University Joint Institute of Frontier Technology