Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

📅 2024-05-27
📈 Citations: 1
Influential: 0
🤖 AI Summary
To jointly model structural triples and heterogeneous multi-modal features in multi-modal knowledge graph completion (MMKGC), this paper proposes MoMoK, a relation-guided Mixture of Modality Knowledge Experts framework. MoMoK introduces a mixture-of-experts mechanism in which modality-specific experts are activated and weighted by relation-conditioned gating, explicitly modeling how much each modality contributes under different relations. By combining multi-expert routing, modality embedding disentanglement, and mutual information minimization, it learns relation-aware adaptive entity representations, in contrast to conventional monolithic fusion approaches. On four public MMKG benchmarks, MoMoK consistently outperforms state-of-the-art methods and remains robust under complex relational patterns and sparse multi-modal inputs.
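
The relation-conditioned gating described above can be sketched in a few lines. The following PyTorch snippet is a minimal illustration under assumed design choices, not the authors' implementation: the class `RelationGuidedMoE`, the per-modality expert MLPs, and all dimensions are hypothetical stand-ins for the paper's modality knowledge experts.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationGuidedMoE(nn.Module):
    """Hypothetical sketch: fuse modality-specific expert embeddings
    with a gate conditioned on the query relation."""

    def __init__(self, dim: int, num_modalities: int, num_relations: int):
        super().__init__()
        # One small expert network per modality (e.g., structure/image/text).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_modalities)
        )
        self.rel_emb = nn.Embedding(num_relations, dim)
        # The gate maps the relation embedding to one weight per expert.
        self.gate = nn.Linear(dim, num_modalities)

    def forward(self, modal_feats: torch.Tensor, rel_ids: torch.Tensor):
        # modal_feats: (batch, num_modalities, dim); rel_ids: (batch,)
        expert_out = torch.stack(
            [exp(modal_feats[:, m]) for m, exp in enumerate(self.experts)],
            dim=1,
        )  # (batch, num_modalities, dim)
        weights = F.softmax(self.gate(self.rel_emb(rel_ids)), dim=-1)
        # Relation-aware fusion: weighted sum over modality experts.
        fused = (weights.unsqueeze(-1) * expert_out).sum(dim=1)
        return fused, expert_out, weights

# Example: 3 modalities (structure, image, text), 64-dim, 10 relations.
moe = RelationGuidedMoE(dim=64, num_modalities=3, num_relations=10)
feats = torch.randn(8, 3, 64)
rels = torch.randint(0, 10, (8,))
entity_repr, per_expert, gate_weights = moe(feats, rels)
```

The key design point is that the gate depends only on the relation, so the same entity can receive a different modality mix in different relational contexts.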

📝 Abstract
Learning high-quality multi-modal entity representations is an important goal of multi-modal knowledge graph (MMKG) representation learning, which can enhance reasoning tasks within the MMKGs, such as MMKG completion (MMKGC). The main challenge is to collaboratively model the structural information concealed in massive triples and the multi-modal features of the entities. Existing methods focus on crafting elegant entity-wise multi-modal fusion strategies, yet they overlook the utilization of multi-perspective features concealed within the modalities under diverse relational contexts. To address this issue, we introduce a novel framework with Mixture of Modality Knowledge experts (MoMoK for short) to learn adaptive multi-modal entity representations for better MMKGC. We design relation-guided modality knowledge experts to acquire relation-aware modality embeddings and integrate the predictions from multi-modalities to achieve joint decisions. Additionally, we disentangle the experts by minimizing their mutual information. Experiments on four public MMKG benchmarks demonstrate the outstanding performance of MoMoK under complex scenarios.
Problem

Research questions and friction points this paper is trying to address.

Learning adaptive multi-modal entity representations for MMKGs.
Modeling structural and multi-modal features collaboratively.
Utilizing multi-perspective features in diverse relational contexts.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Modality Knowledge experts (MoMoK) framework for adaptive multi-modal entity representation learning
Relation-guided modality knowledge experts that produce relation-aware modality embeddings and joint multi-modal decisions
Expert disentanglement by minimizing the mutual information between experts (see the sketch after this list)
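
The abstract only states that the experts are disentangled by minimizing their mutual information, without naming an estimator. One widely used sample-based upper bound for such penalties is CLUB (Cheng et al., ICML 2020); the sketch below is a simplified, assumed variant for illustration, not necessarily the paper's exact estimator.

```python
import torch
import torch.nn as nn

class CLUBUpperBound(nn.Module):
    """Simplified CLUB-style upper bound on I(x; y), using a learned
    Gaussian variational approximation q(y | x). Hypothetical sketch."""

    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.mu = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.logvar = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def loglik(self, x, y):
        # Gaussian log-likelihood log q(y | x), up to an additive constant.
        mu, logvar = self.mu(x), self.logvar(x)
        return (-((y - mu) ** 2) / logvar.exp() - logvar).sum(-1) / 2.0

    def forward(self, x, y):
        # CLUB bound: E_p(x,y)[log q(y|x)] - E_p(x)p(y)[log q(y|x)];
        # negative pairs come from shuffling y within the batch.
        y_shuffled = y[torch.randperm(y.size(0))]
        return (self.loglik(x, y) - self.loglik(x, y_shuffled)).mean()

# Usage: q is first fit by maximizing loglik on matched pairs, then the
# main model minimizes forward() between two experts' output embeddings.
club = CLUBUpperBound(dim=64)
x_expert, y_expert = torch.randn(8, 64), torch.randn(8, 64)
mi_penalty = club(x_expert, y_expert)
```

Driving this penalty down pushes the modality experts toward encoding complementary rather than redundant information.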
👥 Authors
Yichi Zhang
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph
Zhuo Chen
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph
Lingbing Guo
Zhejiang University
Yajing Xu
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph
Binbin Hu
BUPT & Ant Group
Ziqi Liu
Ant Group
Wen Zhang
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph, Alibaba-Zhejiang University Joint Institute of Frontier Technology
Huajun Chen
Zhejiang University, Zhejiang University-Ant Group Joint Laboratory of Knowledge Graph, Alibaba-Zhejiang University Joint Institute of Frontier Technology