🤖 AI Summary
To address the brittleness of multimodal knowledge graph (MMKG) representations, which leads to inaccurate knowledge embeddings and hallucinations in large language models, this paper proposes SNAG, a framework built around a modality-level noise-masking mechanism that drives robust cross-modal feature fusion within Transformer encoders. SNAG adopts a dual-objective training paradigm that jointly optimizes multimodal knowledge graph completion (MKGC) and multimodal entity alignment (MMEA), enabling both standalone deployment and plug-and-play enhancement of other models. Evaluated across ten benchmark datasets, SNAG consistently outperforms state-of-the-art methods, achieving significant gains in completion accuracy and alignment precision. Moreover, it reliably improves diverse backbone models, demonstrating strong generalizability. By unifying noise-aware representation learning with joint multimodal reasoning, SNAG establishes a scalable, robust paradigm for MMKG representation learning and knowledge injection into large models.
📝 Abstract
The rise of Multi-modal Pre-training highlights the necessity for a unified Multi-Modal Knowledge Graph (MMKG) representation learning framework. Such a framework is essential for effectively embedding structured knowledge into multi-modal Large Language Models, alleviating issues like knowledge misconceptions and multi-modal hallucinations. In this work, we examine how accurately models embed entities in MMKGs through two pivotal tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose SNAG, a novel method that utilizes a Transformer-based architecture equipped with modality-level noise masking to robustly integrate multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves state-of-the-art performance across a total of ten datasets, demonstrating its versatility. Moreover, SNAG not only functions as a standalone model but can also enhance other existing methods, providing stable performance improvements. Code and data are available at https://github.com/zjukg/SNAG.
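The modality-level noise masking described above can be illustrated with a minimal sketch: during training, entire modality feature vectors (e.g., image or text embeddings of an entity) are randomly zeroed out before Transformer-based fusion, encouraging the model to tolerate noisy or missing modalities. The function name, feature layout, and masking rate below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mask_modalities(features, mask_prob=0.3, rng=None):
    """Modality-level noise masking (hypothetical sketch, not SNAG's exact code).

    `features` maps a modality name (e.g. "image", "text", "structure")
    to that modality's feature vector for one entity. With probability
    `mask_prob`, an entire modality vector is replaced by zeros, so the
    downstream Transformer encoder must fuse the remaining modalities.
    """
    rng = rng or np.random.default_rng(0)
    masked = {}
    for name, vec in features.items():
        if rng.random() < mask_prob:
            masked[name] = np.zeros_like(vec)  # drop the whole modality
        else:
            masked[name] = vec
    return masked

# Usage: mask per-entity modality features before feeding them to the fusion encoder.
entity_feats = {"image": np.ones(4), "text": np.full(4, 2.0)}
train_feats = mask_modalities(entity_feats, mask_prob=0.3)
```

The masked feature set would then be stacked into a token sequence and passed through the Transformer encoder for cross-modal fusion.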