Every Little Helps: Building Knowledge Graph Foundation Model with Fine-grained Transferable Multi-modal Tokens

📅 2026-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization of existing multimodal knowledge graph reasoning methods, particularly their poor transferability to unseen graphs. The authors propose TOFU, a foundation model for multimodal knowledge graphs that, for the first time, discretizes structural, visual, and textual information into fine-grained, modality-specific, transferable tokens. TOFU processes these tokens with a hierarchical fusion architecture and a mixture-of-message mechanism to enable cross-graph reasoning. Evaluated on 17 benchmarks covering transductive, inductive, and fully inductive settings, TOFU consistently outperforms strong knowledge graph foundation model and multimodal reasoning baselines, including on previously unseen knowledge graphs.

📝 Abstract
Multi-modal knowledge graph reasoning (MMKGR) aims to predict missing links by exploiting both graph structure and multi-modal entity content. Most existing works are designed for the transductive setting: they learn dataset-specific embeddings and struggle to generalize to new KGs. Recent knowledge graph foundation models (KGFMs) improve cross-KG transfer, but they mainly exploit structural patterns and ignore rich multi-modal signals. We address these gaps by proposing a token-based foundation model (TOFU) for MMKGR, which exhibits strong generalization across different MMKGs. TOFU discretizes structural, visual, and textual information into modality-specific tokens. It then processes these tokens with a hierarchical fusion architecture with mixture-of-message mechanisms to obtain transferable features for MMKGR. Experimental results on 17 transductive, inductive, and fully-inductive MMKGs show that TOFU consistently outperforms strong KGFM and MMKGR baselines, including on unseen MMKGs.
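
The abstract does not say how TOFU discretizes each modality. As a rough illustration of what "modality-specific tokens" could mean, the sketch below assumes a nearest-neighbor lookup in a per-modality codebook (plain vector quantization), which is a stand-in technique and not necessarily the authors' method. The function names, the toy codebooks, and the 2-dimensional features are all hypothetical.

```python
import math

def nearest_token(feature, codebook):
    """Return the index of the codebook vector closest to `feature` (Euclidean distance)."""
    best_id, best_dist = -1, math.inf
    for token_id, code in enumerate(codebook):
        dist = math.dist(feature, code)
        if dist < best_dist:
            best_id, best_dist = token_id, dist
    return best_id

def tokenize_entity(features_by_modality, codebooks):
    """Map each modality's feature vector to a token id from that modality's own codebook."""
    return {
        modality: nearest_token(vec, codebooks[modality])
        for modality, vec in features_by_modality.items()
    }

# Hypothetical toy codebooks: one per modality, two codes each (assumption, not from the paper).
codebooks = {
    "structural": [[0.0, 0.0], [1.0, 1.0]],
    "visual": [[0.0, 1.0], [1.0, 0.0]],
    "textual": [[0.5, 0.5], [2.0, 2.0]],
}

# Hypothetical features for a single entity.
entity = {
    "structural": [0.9, 0.8],
    "visual": [0.1, 0.9],
    "textual": [1.8, 2.1],
}

tokens = tokenize_entity(entity, codebooks)
print(tokens)
```

Because token ids index a shared codebook and not a per-entity embedding table, an entity from a previously unseen graph can be mapped to the same vocabulary, which is the intuition behind calling such tokens transferable.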
Problem

Research questions and friction points this paper is trying to address.

multi-modal knowledge graph reasoning
knowledge graph foundation model
cross-KG transfer
generalization
multi-modal signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

token-based foundation model
multi-modal knowledge graph reasoning
transferable representation
hierarchical fusion
modality-specific tokens