Noise-powered Multi-modal Knowledge Graph Representation Framework

📅 2024-03-11
📈 Citations: 1
Influential: 0
🤖 AI Summary
To address the brittleness of multi-modal knowledge graph (MMKG) representations, which leads to inaccurate knowledge embedding and hallucination in large language models, this paper proposes SNAG, a framework built around a modality-level noise-masking mechanism that drives robust cross-modal feature fusion within Transformer encoders. SNAG adopts a dual-objective training paradigm that jointly optimizes multi-modal knowledge graph completion (MKGC) and multi-modal entity alignment (MMEA), so it can be deployed standalone or used as a plug-and-play enhancement for other models. Evaluated across ten benchmark datasets, SNAG consistently outperforms state-of-the-art methods, with significant gains in completion accuracy and alignment precision, and it reliably improves diverse backbone models, demonstrating strong generalizability. By unifying noise-aware representation learning with joint multi-modal reasoning, SNAG offers a scalable, robust paradigm for MMKG representation learning and for injecting structured knowledge into large models.
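The core idea described above, masking whole modalities at random during training so a Transformer encoder learns to fuse entity features robustly under noisy or missing inputs, can be sketched as follows. This is an illustrative sketch only, not the authors' released code; the class name, dimensions, and masking rate are hypothetical choices.

```python
# Illustrative sketch (not the SNAG authors' implementation): modality-level
# noise masking followed by Transformer-based cross-modal fusion. Assumes each
# entity has one feature vector per modality (e.g. structure, image, text).
import torch
import torch.nn as nn


class NoiseMaskedFusion(nn.Module):
    def __init__(self, dim=128, mask_prob=0.2):
        super().__init__()
        self.mask_prob = mask_prob
        # learnable placeholder that stands in for a masked-out modality
        self.mask_token = nn.Parameter(torch.zeros(dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, feats):
        # feats: (batch, num_modalities, dim)
        if self.training:
            # drop entire modalities at random to simulate noisy/missing inputs
            drop = torch.rand(feats.shape[:2], device=feats.device) < self.mask_prob
            feats = torch.where(
                drop.unsqueeze(-1), self.mask_token.expand_as(feats), feats
            )
        fused = self.encoder(feats)   # self-attention mixes the modalities
        return fused.mean(dim=1)      # one fused embedding per entity


model = NoiseMaskedFusion()
x = torch.randn(4, 3, 128)            # 4 entities, 3 modalities each
print(model(x).shape)                 # torch.Size([4, 128])
```

The fused entity embeddings would then feed task-specific heads; in the paper's dual-objective setup, one head scores MKGC triples and another aligns entity pairs for MMEA.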

📝 Abstract
The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph (MMKG) representation learning framework. Such a framework is essential for embedding structured knowledge into multi-modal Large Language Models effectively, alleviating issues like knowledge misconceptions and multi-modal hallucinations. In this work, we explore the efficacy of models in accurately embedding entities within MMKGs through two pivotal tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking to robustly integrate multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves SOTA performance across a total of ten datasets, demonstrating its versatility. Moreover, SNAG can not only function as a standalone model but also enhance other existing methods, providing stable performance improvements. Code and data are available at https://github.com/zjukg/SNAG.
Problem

Research questions and friction points this paper is trying to address.

Multi-modal Knowledge Graphs
Representation Learning
Knowledge Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

SNAG
Transformer Architecture
Multi-modal Knowledge Graphs
👥 Authors

Zhuo Chen
Zhejiang University, ZJU-Ant Group Joint Lab of Knowledge Graph

Yin Fang
National Institutes of Health
AI4Bioinformatics · Knowledge Graph · Language Model

Yichi Zhang
Zhejiang University, ZJU-Ant Group Joint Lab of Knowledge Graph

Lingbing Guo
Tianjin University
Machine Learning · Artificial Intelligence

Jiaoyan Chen
Department of Computer Science, University of Manchester
Knowledge Graph · Ontology · Machine Learning · Large Language Model
Huajun Chen
Zhejiang University, ZJU-Ant Group Joint Lab of Knowledge Graph
Wen Zhang
Zhejiang University, ZJU-Ant Group Joint Lab of Knowledge Graph