A Zero-shot Learning Method Based on Large Language Models for Multi-modal Knowledge Graph Embedding

📅 2025-03-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Zero-shot embedding representation for unseen classes remains challenging in multimodal knowledge graphs (MMKGs). Method: This paper proposes ZSLLM, the first framework to integrate large language models (LLMs) into zero-shot learning for MMKGs. Leveraging prompt engineering—without fine-tuning—ZSLLM activates the LLM’s semantic reasoning capability to enable cross-modal semantic transfer and generate embeddings for unseen entities and relations. It jointly aligns textual and structural modalities by integrating textual prompts with graph-structured knowledge. Results: Evaluated on multiple real-world datasets, ZSLLM achieves substantial improvements over state-of-the-art methods in unseen relation prediction and cross-modal link prediction, with an average 12.7% gain in Hits@1. These results demonstrate the strong generalization capability and practical utility of LLMs for open-domain zero-shot learning in MMKGs.

Technology Category

Application Category

📝 Abstract
Zero-shot learning (ZL) is crucial for tasks involving unseen categories, such as natural language processing, image classification, and cross-lingual transfer. Current applications often fail to accurately infer and handle new relations or entities involving unseen categories, severely limiting their scalability and practicality in open-domain scenarios. ZL learning faces the challenge of effectively transferring semantic information of unseen categories in multi-modal knowledge graph (MMKG) embedding representation learning. In this paper, we propose ZSLLM, a framework for zero-shot embedding learning of MMKGs using large language models (LLMs). We leverage textual modality information of unseen categories as prompts to fully utilize the reasoning capabilities of LLMs, enabling semantic information transfer across different modalities for unseen categories. Through model-based learning, the embedding representation of unseen categories in MMKG is enhanced. Extensive experiments conducted on multiple real-world datasets demonstrate the superiority of our approach compared to state-of-the-art methods.
Problem

Research questions and friction points this paper is trying to address.

Enhance zero-shot learning for unseen categories in multi-modal knowledge graphs.
Transfer semantic information across modalities using large language models.
Improve scalability and practicality in open-domain scenarios.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses large language models for zero-shot learning
Leverages textual prompts for unseen categories
Enhances multi-modal knowledge graph embeddings
🔎 Similar Papers
No similar papers found.
Bingchen Liu
Bingchen Liu
Research Scientist, Character.ai & Playground AI.
Computer VisionMachine LearningDeep Learning
N
Naixing Xu
School of Software, Shandong University, Jinan, China
X
Xin Li
School of Software, Shandong University, Jinan, China